EmbeddedRelated.com

Endianness does not apply to byte

Started by karthikbg November 17, 2006
rTrenado wrote:
> I am sooooo confused now! Why would Little Endian be perverse,
> annoying & such?!?!?!
It's not really. It's just that most languages arrange numbers from left to right when writing them (even Arabic or Hebrew, where words normally run right to left). That is when writing numbers; when speaking numbers, many languages go the other way or mix things up: "four and twenty." (This may be why Arabic only looks like it switches order when doing numbers, when the digits are really being written least-significant first.) Some languages are written top to bottom, too.

Big-endian is only natural to people for whom left-to-right, most-significant-first is natural. Computer memory doesn't have lefts or rights or tops or bottoms.

-- Darin Johnson
Rob Windgassen wrote:
> AFAIK the mask in rlwinm instructions requires the begin and end mask
> bit indices to be specified.
Yeah, after looking it up, that's right. I always considered the bit positions to be "amount to rotate": i.e., rotate the original register as normal, then generate the mask by also rotating 1's and 0's (which may be what occurs internally). The masks are always constants as well.

Practically speaking, though, this "endianness" is irrelevant to data interchange or portability, unlike byte-endianness.

-- Darin Johnson
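For anyone following along without a PPC manual handy, the rotate-and-mask behaviour being discussed can be sketched in C. This is only a model of my reading of rlwinm (rotate left by SH, then AND with a mask of 1-bits running from bit MB through bit ME in IBM numbering, where bit 0 is the MSB), not something checked against real silicon; the function names are made up:

```c
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned n) {
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Model of PowerPC rlwinm: rotate left by sh, then AND with a mask
   whose 1-bits run from bit mb through bit me in IBM numbering
   (bit 0 = MSB).  A wrapping mask (mb > me) is also allowed. */
static uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me) {
    uint32_t start = 0xFFFFFFFFu >> mb;         /* bits mb..31 set */
    uint32_t end   = 0xFFFFFFFFu << (31 - me);  /* bits 0..me set  */
    uint32_t mask  = (mb <= me) ? (start & end) : (start | end);
    return rotl32(rs, sh) & mask;
}
```

For example, rlwinm(x, 0, 24, 31) extracts the low byte (IBM bits 24..31 are the least-significant byte), which is the usual way the instruction ends up encoding AND-with-constant idioms.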
Grant Edwards wrote:
> On 2006-11-17, DJ Delorie <dj@delorie.com> wrote:
>> "karthikbg" <karthik.balaguru@lntinfotech.com> writes:
>>> Why did Little-Endian come ?
>> My guess is compatibility with previous processors. I.e. emulating
>> an 8 bit cpu on a 16 bit LE cpu might be easier than on a 16 bit BE
>> cpu.
>
> Little-endian can be a bit simpler for the hardware design
> and/or compiler since the address you put on the bus when you
> read a variable doesn't change depending on the destination
> type.
>
> If variable X is a 32-bit integer at address 0x1234, reading it
> as a long, char or short always generates address 0x1234. For a
> big-endian machine, the address changes depending on how you
> want to read the variable. Reading it as a char generates
> address 0x1237. Reading it as a short generates 0x1236.
>
> Not a huge deal, but back in the day gates were more expensive.
My understanding was that big-endian is easier to implement in hardware, at least for a modern processor with a wide bus and load-store architecture. AFAIK, most RISC processors use big-endian, and for those that support both (like the PPC and MIPS), little-endian is very much a second-class citizen, and is dropped on smaller implementations.

In the old days, when most processors were 8-bit, using little-endian format could make a big difference if you were dealing with multi-byte data. If you are trying to do arithmetic on 32-bit (or more) numbers, then little-endian ordering makes a lot of sense, so the usage probably stems from the days when 8-bit processors were used for number crunching and financial work.
Grant Edwards wrote:
> On 2006-11-17, David Brown <david@westcontrol.removethisbit.com> wrote:
>
>>> [1] There are some processors that can address bits within a
>>> register with certain instructions. All the ones I've seen
>>> call the LSB bit "0". I've never seen such bit-addressing
>>> made visible in a high-level language -- except possibly
>>> PL/M-51 from Intel (I never actually wrote in PL/M-51, and
>>> have rather vague memories of it). The 8051 had a nifty
>>> feature where there was a block of memory that was
>>> bit-addressable. I don't remember if the bit addressing
>>> was big-endian or little-endian.
>>>
>>> I have seen documentation (IBM?) where the register
>>> descriptions label the MSB of a register as bit 0, but
>>> I've not run into any hardware that works that way.
>
>> Yes, IBM refers to bit 0 as the MSB. Thus if you have a
>> powerpc device (from IBM or Freescale), and look at the pin
>> numbering, D0 is the MSB of the databus and D31 is the LSB.
>
> Is that convention visible to the programmer in any way? IOW,
> at the assembly language level are there instructions that use
> an integer "bit index" as an operand?
>
> A completely fictitious example where bit "endianness" would
> be visible:
>
>     BITSET R0,#0    // Sets MSB of R0
>
>     MOV R1,#31
>     BITCLR R0,R1    // Clears LSB of R0
Yes, that's about right for the PPC. You *really* want to avoid assembly programming on that thing. I've only used a PPC device on one project, but I even wrote the crt0 startup code in C after the initial "load stack pointer, jump to start code" assembly code. The convention is also visible to anyone reading the user manuals, and to the guy drawing out the schematics. The scope for error here is enormous.
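For anyone stuck translating between the two conventions (say, between an IBM manual and C code), the mapping is just a subtraction. A small sketch with made-up helper names, mirroring the fictitious BITSET R0,#0 example where bit 0 means the MSB:

```c
#include <stdint.h>

/* IBM/PowerPC manuals number bits MSB-first (bit 0 = MSB of a 32-bit
   register); C and most other CPUs number them LSB-first (bit 0 = LSB). */
static inline int msb0_to_lsb0(int msb0_bit) { return 31 - msb0_bit; }

static inline uint32_t bitset_msb0(uint32_t r, int b) {
    return r | (1u << msb0_to_lsb0(b));       /* set IBM bit b */
}
static inline uint32_t bitclr_msb0(uint32_t r, int b) {
    return r & ~(1u << msb0_to_lsb0(b));      /* clear IBM bit b */
}
```

So bitset_msb0(0, 0) yields 0x80000000 (the MSB), and clearing IBM bit 31 clears the LSB, exactly the behaviour the fictitious BITSET/BITCLR pair describes.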
> Off the top of my head, I can think of one CPU that has
> instructions like that (Hitachi H8), and it's "little-endian"
> when it comes to bit-ordering (bit 0 is LSB), even though it's
> "big-endian" when it comes to byte ordering (the most
> significant byte of a multibyte integer has the lowest
> address).
David Brown wrote:
> Yes, that's about right for the PPC. You *really* want to avoid
> assembly programming on that thing.
PPC is pretty easy to use and learn, if you ignore the rotate-and-mask instructions (and even then you can just keep a guide handy). I certainly found it a lot less mysterious than many other assemblers, and easier to use: lots of registers, a very orthogonal instruction set, load/store with simple addressing modes, no LEA, etc.

-- Darin Johnson
David Brown <david@westcontrol.removethisbit.com> writes:
> Grant Edwards wrote:
> > On 2006-11-17, DJ Delorie <dj@delorie.com> wrote:
> >> "karthikbg" <karthik.balaguru@lntinfotech.com> writes:
> >>> Why did Little-Endian come ?
> >> My guess is compatibility with previous processors. I.e. emulating
> >> an 8 bit cpu on a 16 bit LE cpu might be easier than on a 16 bit BE
> >> cpu.
> >
> > Little-endian can be a bit simpler for the hardware design
> > and/or compiler since the address you put on the bus when you
> > read a variable doesn't change depending on the destination
> > type.
> >
> > If variable X is a 32-bit integer at address 0x1234, reading it
> > as a long, char or short always generates address 0x1234. For a
> > big-endian machine, the address changes depending on how you
> > want to read the variable. Reading it as a char generates
> > address 0x1237. Reading it as a short generates 0x1236.
> >
> > Not a huge deal, but back in the day gates were more expensive.
>
> My understanding was that big-endian is easier to implement in hardware,
> at least for a modern processor with a wide bus and load-store
> architecture.
That's a good point. It must require a lot of strange internal wiring to reverse the byte order between the bus and registers if the data is Little-Endian.
> AFAIK, most RISC processors use big-endian, and for those
> that support both (like the PPC and MIPS), little-endian
> is very much a second-class citizen, and is dropped on
> smaller implementations.
I asked a question (of this newsgroup?) regarding the choice of endianness made when it's an option, and got not a single response. I wish somebody had strongly told the people doing that ARM compiler that little-endian plus S-records is not the proper combination to arbitrarily assume!
> In the old days, when most processors were 8-bit, using little-endian
> format could make a big difference if you were dealing with multi-byte
> data. If you are trying to do arithmetic on 32-bit (or more) numbers,
> then little-endian ordering makes a lot of sense, so the usage probably
> stems from the days when 8-bit processors were used for number crunching
> and financial work.
I'm not convinced that it makes that much difference even in the cases you cite. Whether you increment a pointer to the data or decrement it makes little difference in the hardware.
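For what it's worth, the 8-bit argument really is about pointer direction. Here is a sketch of a 32-bit add over bytes stored least-significant first (the names and limb count are my own): the loop walks upward from the base address, which is where the carry chain starts; with big-endian storage the same loop would have to walk downward from the end of the number, which is Everett's point that the difference is small:

```c
#include <stdint.h>

#define LIMBS 4   /* a 32-bit number held as four little-endian bytes */

/* dst += src, both stored least-significant byte first, the way an
   8-bit CPU would process them one byte at a time. */
static void add32_le(uint8_t dst[LIMBS], const uint8_t src[LIMBS]) {
    unsigned carry = 0;
    for (int i = 0; i < LIMBS; i++) {
        unsigned sum = dst[i] + src[i] + carry;  /* carry ripples upward */
        dst[i] = (uint8_t)sum;
        carry = sum >> 8;
    }
}
```

For example, 0x01FF stored as {0xFF, 0x01, 0, 0} plus 1 gives {0x00, 0x02, 0, 0}, i.e. 0x0200, with the carry out of byte 0 landing naturally in byte 1.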
On Mon, 20 Nov 2006 10:56:37 PST, Everett M. Greene wrote:

<snip>
>> My understanding was that big-endian is easier to implement in hardware,
>> at least for a modern processor with a wide bus and load-store
>> architecture.
>
> That's a good point. It must require a lot of strange
> internal wiring to reverse the byte order between the
> bus and registers if the data is Little-Endian.
Why, a simple inversion of the lowest address lines will do.

-- Coos
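If I read Coos correctly: no byte-lane rewiring is needed, because on a 32-bit bus the opposite byte order falls out of XORing the two low address bits on byte accesses. A sketch (hypothetical helper name):

```c
/* Flip the low two address bits on byte accesses: byte 0 of each
   32-bit word maps to byte 3 and vice versa, so software sees the
   opposite byte order without any data lanes being rerouted. */
static unsigned byte_addr_swapped(unsigned addr) {
    return addr ^ 3u;
}
```

Note how this reproduces the addresses from Grant's example: a char read of the 32-bit variable at 0x1234 comes out as 0x1237, and the mapping is its own inverse.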
"karthikbg" <karthik.balaguru@lntinfotech.com> wrote in message 
news:1163759863.554265.81790@e3g2000cwe.googlegroups.com...
> I find that regardless of what the byte Endian order is, the bits
> within bytes are always in Big-Endian format.
>
> For instance, 1 is always stored in a byte as 00000001 (binary) no
> matter what platform.
The premise is fatally flawed. Bytes are neither big-endian nor little-endian ... that is human prejudice and the accepted human practice of writing the most significant digit first. There is nothing in a CPU that says you must write 1010 binary for 10 decimal (rather than 0101).

In a lot of applications, little-endian storage makes sense. For example, in GMP (the GNU multiple-precision library), it makes the most sense to store large integers least-significant limb first (there are algorithmic advantages).

It also makes more logical sense. For bits, b[n] is the bit corresponding to 2^n. For bytes, it would likewise make the most sense if B[N] were the byte corresponding to 2^(8*N).
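The B[N] <-> 2^(8*N) correspondence can be written out directly. A minimal sketch (my own function name), fixed at four bytes for simplicity:

```c
#include <stdint.h>

/* With least-significant-first storage, the byte at index N always
   contributes B[N] * 2^(8*N) to the value, mirroring the way bit n
   of a byte contributes 2^n. */
static uint32_t value_of_le_bytes(const uint8_t B[4]) {
    uint32_t v = 0;
    for (int N = 3; N >= 0; N--)    /* accumulate from the top down */
        v = (v << 8) | B[N];
    return v;
}
```

So {0x78, 0x56, 0x34, 0x12} decodes to 0x12345678, and index and weight grow together; there is no "count the digits first" step of the kind discussed below.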
David T. Ashley wrote:

> The premise is fatally flawed.
Actually the original flaw is in the number system we use in everyday life. The numbers we use are actually written bass-ackwards, and have been ever since they were imported from Arabia to Central Europe.

Look at how you pronounce a multi-digit number you see printed on paper. Before you can pronounce the first couple of digits, you have to count how many digits there are. I.e. you have to jump back and forth in the line:

   This number is 3939586729

is scanned: T h i s   n u m b e r   i s [... wait, how many digits??? 10, OK, so it's ...] three billion (939) million (586) thousand 729. It's really quite ridiculous. Wouldn't it be a whole lot easier if the number were written

   9276859393

and pronounced: ninetwoseven sixeightfivethousand ninethreeninemillion threebillion? No jumping all over the place to count digits, no kids wondering why they have to add starting at the "back" end in elementary school.

Little-endian would have made sense in pretty much every aspect of arithmetic, long before "computer" turned from being the name for a person doing arithmetic for a living to that of a machine. But that's water under the bridge, and has been for about 1000 years now.
"Hans-Bernhard Bröker" <HBBroeker@t-online.de> wrote in message 
news:ejtevr$9n3$00$1@news.t-online.com...
> David T. Ashley wrote:
>
>> The premise is fatally flawed.
>
> Actually the original flaw is in the number system we use in everyday
> life. The numbers we use are actually written bass-ackwards, and have
> been ever since they were imported from Arabia to Central Europe.
>
> Look at how you pronounce a multi-digit number you see printed on paper.
> Before you can pronounce the first couple of digits, you have to count how
> many digits there are. I.e. you have to jump back and forth in the line:
>
>    This number is 3939586729
>
> is scanned T h i s   n u m b e r   i s [... wait, how many digits??? 10, OK,
> so it's...] three billion (939) million (586) thousand 729. It's really
> quite ridiculous. Wouldn't it be a whole lot easier if the number were
> written
>
>    9276859393
I actually thought about the argument you made before I made my post (that one has to count the number of digits before one can figure out what the first digit means).

I think human beings are always looking for an abstraction (i.e. to reduce the amount of information without changing its significance). In that sense, big-endian seems more human-friendly. One can often ignore the last several digits. The population of a country might be 297,322,101 ... one can really stop reading after the second digit.

Also, in speaking, with some numbers, saying them in a little-endian way would seem to "delay" the most significant stuff. I can tolerate a language like German because it only does this for the modulo-ten result, i.e. neun-und-neunzig ... but it would drive me crazy if this little-endian discipline were carried out through the whole number.

But my ramblings only apply to humans ... the GMP (for example) is fast for a reason ...
