A Tale of Two Endians


Programming at the Crack of Dawn: On How to Crack a Soft-Boiled Egg, or, A Tale of Two Endians

It’s early morning as I make my way in to work, suffering through caffeine withdrawal, yet I live in mortal fear of the coffee maker and thus am forced to await the talents of those who seem better able to tame the tall metal beast. Somewhere a rooster crows. I look out my fifth-floor window, and as sun and shadow creep across yonder building I see a pigeon perched on the ledge. Cooing. Close enough.

This morning, as I contemplate the chores ahead of me, I sit and think, while peeling a banana, about a puzzle recently posed by one of my colleagues; a puzzle that had more to do with the side on which one cracks open a soft-boiled egg than with how one peels a banana. “What value might you expect v to hold once this function returns?” he asked as he scrawled this little poser on our tiny whiteboard:

    void f(unsigned short *p) { … *p = 1; }

    unsigned long v = 0;
    f((unsigned short*)&v);

It had been crunch time for an entire week as we all struggled day and night and into the early morning hours to cut a stable release of our new software. All of us during this time kept digging up little nuggets such as these, nuggets that no one would lay claim to having buried, yet which were nonetheless welcome curiosities when posed by those who found them; and when you’re eyeball deep in the inevitable bugs and syntax errors that one faces when porting software from one platform to another, any diversion will do. And besides, who are we programmers if not the insatiable solvers of puzzles?

Those of us who cared to take a crack at it sat or stood in quiet contemplation, some of us scratching the bald spots on the tops of our heads, or pulling at our graying beards, or just standing with hands in pockets quietly shuffling our feet. And then it hit me. We are, after all, a multi-platform shop, and while at first someone might look at this and say, “One…uh…right?” I instead asked him under which operating system this had been tested. It was a puzzle of Endians, both little ones and big ones.

Although I have yet to find solid evidence to support the claim, various online sources point out that Big-Endian and Little-Endian, as used when speaking about computer architectures, were derived from the story of Gulliver’s Travels, where the Big-Endians argue that a soft-boiled egg ought to be cracked on the big end, and the opposing political faction argues the opposite. And much like Mr. Swift’s Lilliputians, there apparently rages today a battle in the high-tech industry over a similar political correctness of the big and the small. But instead of eggs we have bytes; instead of the big end of the egg or the little end of the egg, we have the high-order bytes and the low-order bytes of a memory location.

Under the Windows operating system on an Intel box using Visual C++ 6.0, the value of v in our example does indeed come back as 1, and of course this is how you pick out the Windows programmers in a crowd. The reason 1 comes back is that in the Intel
processor, multi-byte numbers are stored in Little-Endian order, which is to say that the least significant byte is stored at the lowest memory address. Big-Endian architectures such as the Sun SPARC processor store things in reverse; in other words, the most significant byte is stored at the lowest memory address. So compiling and running this program under Solaris on a Sun SPARC, with a 32-bit unsigned long and a 16-bit unsigned short, leaves v holding 65536, which poses a problem if the caller of this function is expecting v to be 1 upon return.

As I began researching these Endians, I discovered that there are other architectures that are both little and big Endian, which is to say they are Bi-Endian, but here I leave the pleasure of such perverse thoughts for the reader to contemplate.

So big deal. We have Big-Endians and Little-Endians, and we’re ignoring those Bi-Endians. Who cares? Shouldn’t the operating system handle this? Or maybe the compiler? What’s the problem?

The problem lies in the type conversion that was used to pass this function a long instead of a short. In the example we take the address of the unsigned long, convert it to a pointer to an unsigned short, and pass it into the function. By doing this, the lowest memory address of v is passed into the function, but under Solaris it’s the most significant half of the unsigned long that gets modified by the function. So when we return we have the value 65536 instead of 1.

To illustrate, imagine the following memory addresses on both a Big-Endian and a Little-Endian architecture, each representing the number 3 as a 32-bit, 4-byte integer.

Address   Big-Endian   Little-Endian
=======   ==========   =============
00        00000000     00000011
01        00000000     00000000
02        00000000     00000000
03        00000011     00000000


As we can see, the least significant byte, the one that stores the value of three, is, in the Little-Endian architecture, stored at the lowest memory address, whereas in the Big-Endian architecture it is stored at the highest memory address. So, in our example, by casting the address of the unsigned long to a pointer to an unsigned short, we were inadvertently handing the function the most significant bytes, and thus upon return we no longer get the result we expected to find. We were in fact getting this very result

Address   Big-Endian   Little-Endian
=======   ==========   =============
00        00000000     00000001
01        00000001     00000000
02        00000000     00000000
03        00000000     00000000

which appears to be harmless until you try to print it or evaluate it, at which point you find yourself at two o’clock in the morning, eyes bloodshot, mind reeling, and knee deep in nested functions within your debugger. Had the function taken a pointer to an unsigned long, or had the programmer instead just declared v as an unsigned short, there would have been no bug in the code to fix, no puzzle to solve and no diversion to pull us from the pit of code-porting chaos and back into reality, if only just for a little while.

With the banana long peeled and eaten, with dawn having slipped into the late afternoon, and with this particular egg cracked decidedly down the middle (having of course decided to remain neutral in this particular battle of the intellects), I leave you with the hope that I have made your software porting chores, inasmuch as they concern the issues of the Endian, a little easier to bear.

