Code size reduction migrating from PIC18 to Cortex M0

Started by Kvik May 24, 2012
Hi

We are digging deeper into the Cortex M0 processor versus a PIC18.

Seemingly objective material (CoreMark data) on page 32 of:

http://ics.nxp.com/literature/presentations/microcontrollers/pdf/cortex.m0.code.density.pdf

lists a reduction in code size from PIC18 to M0 by a factor of 2.

But does anyone have real-life experience of the possible code size
reduction?

Thanks

Klaus


Kvik wrote:

> Hi
>
> We are digging deeper into the Cortex M0 processor versus a PIC18.
>
> Seemingly objective material (CoreMark data) on page 32 of:
>
> http://ics.nxp.com/literature/presentations/microcontrollers/pdf/cortex.m0.code.density.pdf
>
> lists a reduction in code size from PIC18 to M0 by a factor of 2.
>
> But does anyone have real-life experience of the possible code size
> reduction?

I have written code generators for both processors. This specific
PowerPoint has been around for a while and shows more about what the
Cortex is good at and what the Microchip PICs are less good at.

Cortex code is smaller than PIC18 code in some embedded applications
that require 32-bit math. In many applications the Cortex has higher
RAM requirements.
This code size reduction from PIC to Cortex-M seems about right.

The 8-bit PICs have lousy code density according to many studies. In
my own comparison (see my blog post "Insects of the Computer World" at
http://embeddedgurus.com/state-space/2009/03/insects-of-the-computer-world/),
I got something like a factor of 5 (!) code size difference between
PIC18 and Cortex-M3. This was for the source code of a small RTOS-like
state machine framework. The PIC18 code was created by the free student
edition of the Microchip C18 compiler. I suspect that the paid
edition can do somewhat better code size optimization.

But the truth remains that the 8-bit PICs have the worst code density
in the industry. Also, contrary to widespread misconceptions, the
8-bitters are not inherently efficient in memory usage. It turns out
that code density has nothing to do with the register file width
(8, 16, or 32 bits), but with how old a given CPU design is. The old
designs, such as the 8-bit PIC and 8051, are lousy. New designs,
regardless of the register size, are much better. The ARM Cortex-M
cores are pretty good. So is the MSP430. But I think that the current
winner in terms of best code density could be the new Renesas RX.

Miro Samek
www.state-machine.com

On 25 May, 04:11, Miro Samek <sa...@quantum-leaps.com> wrote:

> This code size reduction from PIC to Cortex-M seems about right.
>
> The 8-bit PICs have lousy code density according to many studies. In
> my own comparison (see my blog post "Insects of the Computer World" at
> http://embeddedgurus.com/state-space/2009/03/insects-of-the-computer-...),
> I got something like a factor of 5 (!) code size difference between
> PIC18 and Cortex-M3.

Hi Miro

That's a great link, thank you very much :-)

I will take some representative code on a PIC18, compare that to
the M3, and post the results back in the forum, just for fun.

Regards

Klaus

On 25/05/2012 02:52, Walter Banks wrote:

> I have written code generators for both processors. This specific
> PowerPoint has been around for a while and shows more about what the
> Cortex is good at and what the Microchip PICs are less good at.

These sorts of things are always written with a purpose in mind. There
are three sorts of lies...

> Cortex code is smaller than PIC18 code in some embedded applications
> that require 32-bit math. In many applications the Cortex has higher
> RAM requirements.

When looking at ram requirements, it's worth noting the ratio of flash to ram sizes on common devices. Typically, microcontrollers with 8-bit cores have much more flash per byte of ram than those with 32-bit cores. Though there is obviously lots of variation and different types, it is common for 8-bit devices to have 8 to 32 times as much flash as ram, while for 32-bit devices the range is perhaps 2 to 8. This means that the bigger ram requirements caused by things like 32-bit values being more common than 8-bit, pointers moving from 16-bit to 32-bit, and greater stack space requirements for functions and interrupts, have little impact in real-world usage.
On 25/05/2012 07:50, Kvik wrote:

> I will take some representative code on a PIC18 and compare that to
> the M3 and post the results back in the forum, just for fun

As Dave Brown's reply implies, when comparing any attribute of any
product, the manufacturer's own data is the last place to look. You are
always best off taking your own measurements for the use case you are
actually interested in, because the results are use-case specific.

Following Miro's email: the free version of the PIC compilers does not
(normally) include the best optimisation, unless it is during its
evaluation period.

Regards,
Richard.

+ http://www.FreeRTOS.org
Designed for microcontrollers. More than 7000 downloads per month.

+ http://www.FreeRTOS.org/trace
15 interconnected trace views. An indispensable productivity tool.

David Brown wrote:

> When looking at ram requirements, it's worth noting the ratio of flash
> to ram sizes on common devices. Typically, microcontrollers with 8-bit
> cores have much more flash per byte of ram than those with 32-bit cores.
> Though there is obviously lots of variation and different types, it is
> common for 8-bit devices to have 8 to 32 times as much flash as ram,
> while for 32-bit devices the range is perhaps 2 to 8. This means that
> the bigger ram requirements caused by things like 32-bit values being
> more common than 8-bit, pointers moving from 16-bit to 32-bit, and
> greater stack space requirements for functions and interrupts, have
> little impact in real-world usage.

Good point about ROM/RAM ratios. Your numbers are consistent with our
experience. To take the point forward, the high ROM/RAM ratios on many
small 8-bit micros change the way code is generated for these parts:
the code generator spends ROM and execution cycles to save RAM.

Walter..

On 25/05/2012 13:45, Walter Banks wrote:

> Good point about ROM/RAM ratios. Your numbers are consistent with our
> experience. To take the point forward, the high ROM/RAM ratios on many
> small 8-bit micros change the way code is generated for these parts:
> the code generator spends ROM and execution cycles to save RAM.

Yes, ram is "cheaper" on many 32-bit devices than 8-bit devices. On the
other hand, sometimes /accessing/ ram is more expensive (maybe you can't
do direct addressing but must first load a pointer register, maybe
you've got ram that takes multiple cpu clock cycles, maybe you have code
running out of ram as well). If it were easy getting these balances
right, your job wouldn't be half as much fun!

Having more ram on hand also changes the way users write code, and gives
the programmer more freedom.

On May 24, 5:32 pm, Kvik <klaus.kragel...@gmail.com> wrote:

> Hi
>
> We are digging deeper into the Cortex M0 processor versus a PIC18.
>
> Seemingly objective material (CoreMark data) on page 32 of:
>
> http://ics.nxp.com/literature/presentations/microcontrollers/pdf/cort...
>
> lists a reduction in code size from PIC18 to M0 by a factor of 2.
>
> But does anyone have real-life experience of the possible code size
> reduction?
>
> Thanks
>
> Klaus

I've ported a fairly large app from a PIC18 to a Cortex M3 (which I
believe is just a superset of the M0) and the code size actually
INCREASED from about 62K to 129K. This was just plain C code without any
processor-specific optimizations or tricks that was just cut & pasted
from one compiler to the other. While the Cortex does get better density
on things like 32 X 32 multiplies or divides, it suffers horribly on
simple control structures.

For example, clearing a timer interrupt flag:

On the PIC18 this takes 2 bytes:

    PIR1 &= ~TMR1IF;
    2108:  BCF    F9E.0

On the Cortex M3 it takes 40 bytes:

    TIM1->SR &= ~TIM_SR_UIF;
    F6424200    movw   r2, #0x2C00
    F2C40201    movt   r2, #0x4001
    F6424300    movw   r3, #0x2C00
    F2C40301    movt   r3, #0x4001
    8A1B        ldrh   r3, [r3, #16]
    B29B        uxth   r3, r3
    4619        mov    r1, r3
    F64F73FE    movw   r3, #0xFFFE
    F2C00300    movt   r3, #0
    EA010303    and.w  r3, r1, r3
    4619        mov    r1, r3
    460B        mov    r3, r1
    8213        strh   r3, [r2, #16]

A simple countdown:

On the PIC18 it takes 6 bytes:

    if (--timeout) return;
    210A:  DECF   x3B,F
    210C:  BZ     2110
    210E:  BRA    2114

On the Cortex M3 it takes 40 bytes:

    if (--timeout) return;
    F2400360    movw   r3, #0x60
    F2C20300    movt   r3, #0x2000
    7B5B        ldrb   r3, [r3, #13]
    F10333FF    add.w  r3, r3, #0xFFFFFFFF
    B2DA        uxtb   r2, r3
    F2400360    movw   r3, #0x60
    F2C20300    movt   r3, #0x2000
    735A        strb   r2, [r3, #13]
    F2400360    movw   r3, #0x60
    F2C20300    movt   r3, #0x2000
    7B5B        ldrb   r3, [r3, #13]
    2B00        cmp    r3, #0
    D128        bne    0x08000F92

This may not be a very fair comparison since both compilers (CCS for the
PIC and gcc for the Cortex) are set to non-optimized mode, but even when
gcc is set to optimize it only drops from 129K down to 104K, which is
not much of a savings and still worse than the PIC18.

When I first started this exercise I was quite disappointed by the poor
density so I tried a simple exercise: I took one single C function that
had more than doubled in size and re-wrote it so as to take advantage of
the Cortex strengths. I made heavy use of 32-bit variables, careful use
of the "register" keyword, always accessing global variables through a
pointer, combining bit shifts with other arithmetic operations, using
bit-banding for IO registers wherever possible, etc.

In the end I managed to get it down to almost half its size, but still
couldn't match the PIC18.

Perhaps the final answer depends on what kind of application you're
writing. In my case it's very IO intensive with a lot of peripherals
being used and a simple touchscreen UI with very little math involved.
Perhaps the Cortex was not the best choice here.

On 20/06/2012 18:07, peter_gotkatov@supergreatmail.com wrote:

> I've ported a fairly large app from a PIC18 to a Cortex M3 (which I
> believe is just a superset of the M0) and the code size actually
> INCREASED from about 62K to 129K.

Did you look at the map file to see why?

If using GCC, did you set the compile options to remove dead code (most
linkers will do it automatically)?

If using GCC, did you avoid using libraries that were written for a much
larger class of processor?

> For example, clearing a timer interrupt flag:
>
> On the PIC18 this takes 2 bytes:
>     PIR1 &= ~TMR1IF;
>     2108:  BCF    F9E.0
>
> On the Cortex M3 it takes 40 bytes:
>     TIM1->SR &= ~TIM_SR_UIF;
>     F6424200    movw   r2, #0x2C00
>     F2C40201    movt   r2, #0x4001
>     F6424300    movw   r3, #0x2C00
>     F2C40301    movt   r3, #0x4001
>     8A1B        ldrh   r3, [r3, #16]
>     B29B        uxth   r3, r3
>     4619        mov    r1, r3
>     F64F73FE    movw   r3, #0xFFFE
>     F2C00300    movt   r3, #0
>     EA010303    and.w  r3, r1, r3
>     4619        mov    r1, r3
>     460B        mov    r3, r1
>     8213        strh   r3, [r2, #16]

"The Cortex-M3 processor has a feature known as "bit-banding". This
allows an individual bit in a memory-mapped mailbox or peripheral
register to be set/cleared by a single store/load instruction to a
bit-band aliased memory address, rather than using a conventional
read/modify/write instruction sequence."

Regards,
Richard.