On 2007-02-15, David Brown <david@westcontrol.removethisbit.com> wrote:
> 16-bit cpus, for the same reasons. The AVR is not bad for an
> eight-bitter, although they made a couple of terrible design decisions
> regarding C compatibility. The PPC takes a lot of getting used to, and

I would be interested in hearing more if you would like to elaborate on what you think those are.
PIC vs ARM assembler (no flamewar please)
Started by ●February 14, 2007
Reply by ●February 18, 2007
Reply by ●February 19, 2007
"Wilco Dijkstra" <Wilco_dot_Dijkstra@ntlworld.com> writes:> "Everett M. Greene" <mojaveg@mojaveg.lsan.mdsg-pacwest.com> wrote > > David Brown <david@westcontrol.removethisbit.com> writes: > > >> I don't have a problem with a processor taking three or four > >> instructions to code an operation like that - on most cpus where you can > >> have complex addressing modes, the instructions take many clock cycles > >> and many instruction words, and could therefore be easily split into > >> several instructions. > > Also these instructions could be scheduled out of a loop, or interleaved > with other instructions so that they execute faster than any complex > instruction. > > >> I dislike architectures where you need several > >> instructions to do the simplest of things, like loading a register with > >> a constant, but I see no benefit in trying to do everything possible > >> within a single instruction. > > Some of this can be alleviated with smart assemblers. There is no > hard and fast rule that says an assembler must produce a single > instruction for each mnemonic. Simple instruction selection and > expansion are commonly featured in assemblers. > > > The ARM design is unusual for a RISC in being three-address > > and having conditional execution of nearly all its instruc- > > tions. This tends to waste 8 bits of every 32-bit instruc- > > tion. Surely, they could have found better uses for those > > bits [and don't call me Shirley, "Airplane", ca. 1970]. > > I'm curious what better use you have for those 8 bits? Another 8 bits > are "wasted" on the shifter for example, so you could save 16 bits > per instruction. The result is... Thumb. > > But since 16-bit instructions do less (3 operands, shifts and > conditional execution are quite common),Common, yes, but not for every instruction. 
The most common instruction is always execute, no shift, and one of the source regs is the same as the destination.> Thumb needs around 30% more instructions to do the same amount > of work as ARM.True, but 4/3 of 16 bits = 2/3 of 32 bits, itself an indication of how much instruction space is wasted in the ARM design.> Thumb-2 introduces the IT instruction which makes the following 1-4 > instructions conditional. This form of conditional execution wastes fewer > bits overall even though it uses more bits per conditional instruction > (it is still smaller and faster than using branches of course).
Reply by ●February 19, 2007
"Everett M. Greene" <mojaveg@mojaveg.lsan.mdsg-pacwest.com> wrote in message news:20070218.7A4DC68.1367A@mojaveg.lsan.mdsg-pacwest.com...> "Wilco Dijkstra" <Wilco_dot_Dijkstra@ntlworld.com> writes: >> "Everett M. Greene" <mojaveg@mojaveg.lsan.mdsg-pacwest.com> wrote>> > The ARM design is unusual for a RISC in being three-address >> > and having conditional execution of nearly all its instruc- >> > tions. This tends to waste 8 bits of every 32-bit instruc- >> > tion. Surely, they could have found better uses for those >> > bits [and don't call me Shirley, "Airplane", ca. 1970]. >> >> I'm curious what better use you have for those 8 bits? Another 8 bits >> are "wasted" on the shifter for example, so you could save 16 bits >> per instruction. The result is... Thumb. >> >> But since 16-bit instructions do less (3 operands, shifts and >> conditional execution are quite common), > > Common, yes, but not for every instruction. The most common > instruction is always execute, no shift, and one of the source > regs is the same as the destination.Yes, but a pure 2-operand instruction set requires additional moves to avoid overwriting the destination register. Thumb has various 3-opnd instructions to reduce this.>> Thumb needs around 30% more instructions to do the same amount >> of work as ARM. > > True, but 4/3 of 16 bits = 2/3 of 32 bits, itself an > indication of how much instruction space is wasted in the > ARM design.Correct, the information content of ARM instructions is around 19-20 bits per instruction. However you can't get there using a fixed length encoding, so neither ARM nor Thumb are optimal. Thumb-2 uses mixed 16/32-bit encodings to get the best of both worlds. You can do better still with significantly more complex encodings, and get down to about 50% of ARM codesize. However it's unlikely it is worth the hardware complexity. RISC focuses more on easy decoding... Wilco
Reply by ●February 19, 2007
Wilco Dijkstra wrote:
> "Everett M. Greene" <mojaveg@mojaveg.lsan.mdsg-pacwest.com> wrote:
>
... snip ...
>>
>> Common, yes, but not for every instruction. The most common
>> instruction is always execute, no shift, and one of the source
>> regs is the same as the destination.
>
> Yes, but a pure 2-operand instruction set requires additional
> moves to avoid overwriting the destination register. Thumb has
> various 3-opnd instructions to reduce this.

What this really does is require a dedicated temporary register, which is no longer available to the programmer. The temporary may be virtual.

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
<http://www.securityfocus.com/columnists/423>
"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discoverer of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews
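The extra move that a two-operand form needs can be shown with a toy register machine in Python. This is a sketch under stated assumptions: the `mov`, `add2` and `add3` mnemonics and the `run` helper are invented for illustration and are not ARM or Thumb encodings.

```python
def run(program, regs):
    """Execute a list of instruction tuples on a copy of the register dict."""
    regs = dict(regs)
    for instr in program:
        op = instr[0]
        if op == "mov":        # mov dst, src
            _, dst, src = instr
            regs[dst] = regs[src]
        elif op == "add2":     # two-operand form: dst = dst + src
            _, dst, src = instr
            regs[dst] = regs[dst] + regs[src]
        elif op == "add3":     # three-operand form: dst = src1 + src2
            _, dst, src1, src2 = instr
            regs[dst] = regs[src1] + regs[src2]
        else:
            raise ValueError(f"unknown op {op!r}")
    return regs


# Goal: r2 = r0 + r1 while keeping r0 and r1 intact.
three_operand = [("add3", "r2", "r0", "r1")]
two_operand = [("mov", "r2", "r0"), ("add2", "r2", "r1")]

start = {"r0": 5, "r1": 7, "r2": 0}
print(run(three_operand, start), len(three_operand))
print(run(two_operand, start), len(two_operand))
```

Both programs leave the same register contents; the two-operand version costs one more instruction, and the destination register doubles as the temporary during the copy.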
Reply by ●February 19, 2007
"Wilco Dijkstra" <Wilco_dot_Dijkstra@ntlworld.com> writes:> "Everett M. Greene" <mojaveg@mojaveg.lsan.mdsg-pacwest.com> wrote > > "Wilco Dijkstra" <Wilco_dot_Dijkstra@ntlworld.com> writes: > >> "Everett M. Greene" <mojaveg@mojaveg.lsan.mdsg-pacwest.com> wrote > > >> > The ARM design is unusual for a RISC in being three-address > >> > and having conditional execution of nearly all its instruc- > >> > tions. This tends to waste 8 bits of every 32-bit instruc- > >> > tion. Surely, they could have found better uses for those > >> > bits [and don't call me Shirley, "Airplane", ca. 1970]. > >> > >> I'm curious what better use you have for those 8 bits? Another 8 bits > >> are "wasted" on the shifter for example, so you could save 16 bits > >> per instruction. The result is... Thumb. > >> > >> But since 16-bit instructions do less (3 operands, shifts and > >> conditional execution are quite common), > > > > Common, yes, but not for every instruction. The most common > > instruction is always execute, no shift, and one of the source > > regs is the same as the destination. > > Yes, but a pure 2-operand instruction set requires additional > moves to avoid overwriting the destination register.I suspect that memory-to-memory operations are more common and the load/store RISC requires a temporary register for this.> Thumb has various 3-opnd instructions to reduce this.Thumb? I thought all the three-operand instructions were removed from Thumb.> >> Thumb needs around 30% more instructions to do the > >> same amount of work as ARM. > > > > True, but 4/3 of 16 bits = 2/3 of 32 bits, itself an > > indication of how much instruction space is wasted in the > > ARM design. > > Correct, the information content of ARM instructions is > around 19-20 bits per instruction. However you can't get > there using a fixed length encoding, so neither ARM nor > Thumb are optimal. Thumb-2 uses mixed 16/32-bit encodings > to get the best of both worlds. 
> > You can do better still with significantly more complex > encodings, and get down to about 50% of ARM codesize. > However it's unlikely it is worth the hardware complexity.> RISC focuses more on easy decoding...True to a degree, but ease of decoding doesn't require three-operand instructions, conditional execution of every instruction, etc.
Reply by ●February 20, 2007
"rickman" <gnuarm@gmail.com> skrev i meddelandet news:1171639930.653456.17030@t69g2000cwt.googlegroups.com...> On Feb 16, 7:39 am, "Ulf Samuelsson" <u...@a-t-m-e-l.com> wrote: >> <ucad...@gmail.com> skrev i >> meddelandetnews:1171495132.249289.198340@v33g2000cwv.googlegroups.com... >> >> >> >> >> >> >> >> > Had a discussion with a _hardware_ guy (as in transistors and OP-amps) >> > about "powerful" micros. >> >> > He his a PIC guy and claimed that PIC have a very nice instruction set >> > and is a pleasure to work with in assembly. He also mentioned the he >> > would rather use a dsPIC instead of an ARM7 because ARM7 is very hard >> > to program and has a confusing assembly (we never talked application, >> > so I assume he meant this holds regardless of application). He also >> > said that another major advantage of dsPIC is that its a PIC, hence >> > the know-how and toolchain advantage... >> >> > Completely shocked, I told him that my experience was the exact >> > opposite, and I really enjoy ARM assembler (well, maybe not enjoy...). >> > Anyway, after that, the discussion turned into a flamewar... >> >> > So what do you say? Maybe I have been wrong all the time? >> >> > What do you guys think about the instruction set and architecture, >> > provided that you were forced to code in assembly and we ignored the >> > fact that these is more of an apples vs pink-flying-elephants >> > comparison... >> >> > (you can also include your background and your other favorite micros >> > such as AVR and MSP4xx, but_ please_ don't flame. and you must REALLY >> > HAVE WORKED with all of them, no gusses please :) ) >> >> > ((yes, I REALLY do want your answers. Because I suspect the answer >> > will differ very much dependent on your background, and experience and >> > your application, and I think that information would benefit this >> > little community)) >> >> > -shocked >> >> The Series 32000 instruction set is way superior to anything mentioned so >> far. 
>> The MC68000 instructon set is a murky wannabee in comparision. >> >> Try doing this in a single instruction on another architecture... >> >> pointer1->field1[ix1] = (unsigned int) pointer2->field2[ix2]; >> >> maps to: >> >> movzbd field1(pointer1(sb))[r0:d], field2(pointer2(sp))[r3:d] >> >> Elegance is Everything! >> >> -- >> Best Regards >> Ulf Samuelsson > > Yeah, but how many days does it take to execute? > > ;^)The NS32532 was about two times the speed of the x86/68k competition of its day 68030. The NS32764 (later Swordfish) was one of the first Superscalar RISC processors and would execute code faster than any of the other RISC processors in that time period. It would decode the instruction into risc instructions before execution. The final version of course skipped the 32000 instruction set altogether.> > >
Reply by ●February 20, 2007
"CBFalconer" <cbfalconer@yahoo.com> skrev i meddelandet news:45D5C2B7.7AAA7B51@yahoo.com...> Ulf Samuelsson wrote: >> > ... snip ... >> >> The Series 32000 instruction set is way superior to anything >> mentioned so far. The MC68000 instructon set is a murky wannabee >> in comparision. >> >> Try doing this in a single instruction on another architecture... >> >> pointer1->field1[ix1] = (unsigned int) pointer2->field2[ix2]; >> >> maps to: >> >> movzbd field1(pointer1(sb))[r0:d], field2(pointer2(sp))[r3:d] >> >> Elegance is Everything! > > Is it? I contend that the space required to hold all those fields > in one instruction makes the generic instruction too big, and would > prefer to execute two (or more) instructions. In a stack machine > there will not even be a temporary needed. As a practicing > dinosaur I believe in compact object code. For example, a stack > machine could execute: > > lda field1; TOS := load address > ldn ix; TOS := ix > idx n; TOS := n*TOS + TOS > lda field2 > ldn ix > idx n > ld TOS := TOS^ > sto; TOS^ := TOS > > and only lda need be larger than a byte. This also makes a good > intermediate language if it is desired to port to a wide > instruction machine, such as your 32000 above. The set also has > great piping possibilities. > > --You code does not do all the things the instruction above does. Surprisingly enough, I think that complete instruction can fit into 8-9 bytes depending on the length of the displacement. This is due to the encoded immediates in the Series32000 architecture. It supports 7,14 and 30 bit immediates like this (IIRC). 0xxx_xxxx 10xx_xxxx_xxxx_xxxx 11xx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx -- Best Regards, Ulf Samuelsson This is intended to be my personal opinion which may, or may not be shared by my employer Atmel Nordic AB
Reply by ●February 20, 2007
Terran Melconian wrote:
> On 2007-02-15, David Brown <david@westcontrol.removethisbit.com> wrote:
>> 16-bit cpus, for the same reasons. The AVR is not bad for an
>> eight-bitter, although they made a couple of terrible design decisions
>> regarding C compatibility. The PPC takes a lot of getting used to, and
>
> I would be interested in hearing more if you would like to elaborate on
> what you think those are.

I presume you are referring to the AVR here, rather than my PPC comments. There are, I think, four main areas where the AVR is poor regarding C.

First, it is 8-bit, while C requires 16 bits in many areas. There are a couple of 16-bit instructions, such as movw instructions in later cores, but a few more such instructions could have greatly reduced the overhead of using 16-bit data on an 8-bit cpu.

Second, the split memory space makes it virtually impossible to write a compiler that is standards compliant, efficient, and easy to use regarding access to constant data in flash. Various solutions include introducing new non-standard "flash" keywords, abusing existing "const" keywords, and using user-unfriendly macros. The banking system for the 128 kB and up AVRs makes it even worse.

Third, there are only three pointer registers, only two of which support displacements. This is just too limited for C, and leads to thrashing for code that uses pointers, structs, or arrays.

Fourth, the stack pointer is accessible only as an external register using 8-bit access. This leads to a lot of overhead for stack frame access, and means that one of the pointer registers is needed as a frame pointer. Small functions can keep their parameters and variables in registers, but more complex code requires stack data.

It is issue 2 above that is the real killer - the others cause big, slow and inefficient code, while the split memory space stops code being re-usable across compilers and targets.

None of this stops the AVR from being a very nice family of microcontrollers - it is just that they are not nearly as "C friendly" as their marketing people would like to claim.

Best regards,

David
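The 16-bit overhead from the first point can be modelled in Python as two 8-bit steps, an add on the low bytes followed by an add-with-carry on the high bytes. This is a sketch of the general technique on any 8-bit ALU rather than actual AVR code; `add8` and `add16_on_8bit` are hypothetical helper names.

```python
def add8(a, b, carry_in=0):
    """One 8-bit ALU add: returns (8-bit result, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def add16_on_8bit(x, y):
    """Add two 16-bit values using only 8-bit operations."""
    lo, carry = add8(x & 0xFF, y & 0xFF)        # add the low bytes
    hi, carry = add8(x >> 8, y >> 8, carry)     # add the high bytes plus carry
    return (hi << 8) | lo, carry


print(add16_on_8bit(0x12FF, 0x0001))
print(add16_on_8bit(0xFFFF, 0x0001))
```

Every 16-bit operation in C therefore costs at least two ALU instructions plus the register pair to hold the operands, which is the overhead that a few more native 16-bit instructions would have removed.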
Reply by ●February 20, 2007
Wilco Dijkstra wrote:
<snip>
> Correct, the information content of ARM instructions is around 19-20 bits
> per instruction. However you can't get there using a fixed length encoding,
> so neither ARM nor Thumb are optimal. Thumb-2 uses mixed 16/32-bit
> encodings to get the best of both worlds.
>

That's beginning to sound like the ColdFire (which Freescale refers to as a "variable instruction length RISC processor").

> You can do better still with significantly more complex encodings, and get
> down to about 50% of ARM codesize. However it's unlikely it is worth the
> hardware complexity. RISC focuses more on easy decoding...
>
> Wilco
Reply by ●February 20, 2007
"David Brown" <david@westcontrol.removethisbit.com> wrote in message news:45dae320$0$24609$8404b019@news.wineasy.se...> Wilco Dijkstra wrote:>> Correct, the information content of ARM instructions is around 19-20 bits >> per instruction. However you can't get there using a fixed length >> encoding, >> so neither ARM nor Thumb are optimal. Thumb-2 uses mixed 16/32-bit >> encodings to get the best of both worlds. > > That's beginning to sound like the ColdFire (which Freescale refers to as > a "variable instruction length RISC processor").Not really. ColdFire is a 68K variant removing some of the less frequently used instructions and complex addressing modes. Although this allows for simpler and faster implementations, the instruction set remains as CISCy as the 68K. It's all marketing... There are few RISCs with variable length instructions. Wilco