ARM Cortex M3 - Who's utilizing it?

Started by diggerdo February 17, 2006
Wilco Dijkstra wrote:
> A quick scan of the AVR8 CPUs revealed that the ATtiny2313/V
> seems to be the lowest power AVR at 0.41 mW/MHz.
> Cortex-M3 uses 3.5 times less power...
That is the classic error of comparing a new (future) core-only figure with an existing full-system product. A grapes to pomegranate comparison.

There is a _lot_ more than just the core that determines the system Icc values. ASIC core vendors tend to overlook that, as that's not what they sell.

Even the IC vendors nudge the goal posts, by specifying their uC data with external square wave clocks. Nice way to ignore the XTAL Oscillator Amplifier & Buffer current effects...
> Additionally a 32-bit CPU can do a lot more work per cycle, so they
> run at a lower frequency or sleep for longer. So a higher performance
> CPU that uses more power may actually use less *energy* to do a
> specific task.
Perhaps if we compare a raw core, with a raw core ?

So, let's see how the Cortex compares, with the new ARM Async Core ?

Cortex M3                                         = appx 90uW/MHz
ARM996HS [New Clockless, Async technology core ]  = 45uW/MHz

[These numbers come from the same company, so should be free of
inter-company-skew effects.... ?]

Hmmm - wonder how those (two?) Cortex licensees feel about that ?

A spec of Energy per task is a very good one, and overdue on uC designs.

-jg
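
The energy-per-task idea can be sketched with a few lines of arithmetic. A uW/MHz figure is the same thing as picojoules per clock cycle, so the energy for a task is roughly that figure times the number of cycles the task needs. The two cores and their cycle counts below are hypothetical numbers chosen only to illustrate the point, not figures from any datasheet.

```python
def energy_per_task_pj(uw_per_mhz, cycles):
    """Energy in picojoules to execute `cycles` clock cycles.

    1 uW/MHz = 1e-6 W / 1e6 Hz = 1e-12 J per cycle, i.e. 1 pJ per cycle,
    so the clock frequency cancels out of the result.
    """
    return uw_per_mhz * cycles


# Hypothetical cores and cycle counts, for illustration only.
small_core_pj = energy_per_task_pj(uw_per_mhz=100.0, cycles=4000)
big_core_pj = energy_per_task_pj(uw_per_mhz=150.0, cycles=1000)

print(f"small core: {small_core_pj / 1e6:.2f} uJ")
print(f"big core:   {big_core_pj / 1e6:.2f} uJ")
```

In this made-up case the core with the higher uW/MHz figure still spends less energy on the task, because it needs far fewer cycles. Static current and peripherals are ignored here.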
"Jim Granville" <no.spam@designtools.co.nz> wrote in message 
news:43f7a49d$1@clear.net.nz...
> Wilco Dijkstra wrote:
>> A quick scan of the AVR8 CPUs revealed that the ATtiny2313/V
>> seems to be the lowest power AVR at 0.41 mW/MHz.
>> Cortex-M3 uses 3.5 times less power...
>
> That is the classic error of comparing a new (future) core-only figure,
> with an existing full-system product. A grapes to pomegranate comparison.
Nope. The ATtiny2313/V number above is based on simulation like the M3 figure. The Cortex-M3 figure includes the standard peripherals that are part of the core. In most cases power consumption is measured while running a benchmark such as Dhrystone, so no peripherals are used. With a process tuned for power the leakage current of the peripherals would be minimal.
> There is a _lot_ more than just the core, that determines the
> system Icc values.
> ASIC core vendors tend to over-look that, as that's not what they
> sell.
Peripherals only consume power if you enable and use them. But even then most don't use much power, e.g. a UART running at 100K baud still uses a fraction of the power of a core at 10MHz.
>> Additionally a 32-bit CPU can do a lot more work per cycle, so they
>> run at a lower frequency or sleep for longer. So a higher performance
>> CPU that uses more power may actually use less *energy* to do a
>> specific task.
>
> Perhaps if we compare a raw core, with a raw core ?
>
> So, let's see how the Cortex compares, with the new ARM Async Core ?
>
> Cortex M3                                         = appx 90uW/MHz
> ARM996HS [New Clockless, Async technology core ]  = 45uW/MHz
>
> [These numbers come from the same company, so should be free of
> inter-company-skew effects.... ?]
No. You forgot to take into account the process geometry. The Cortex-M3 number is for 180nm, the 996HS for 130nm. According to datapoints for the similar ARM946E-S, power consumption improves by a factor of 3 to 3.5 on a 180nm process. So Cortex-M3 would still win by a good margin. Maybe we will get a Cortex-M3HS too?
> Hmmm - wonder how those (two?) Cortex licensees feel about that ?
There are 4 of them btw. I'm sure they are still happy - there are lots of reasons for using the M3.
> A spec of Energy per task is a very good one, and overdue on uC
> designs.

Indeed.

Wilco
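
The process-geometry argument in this post can be checked with quick arithmetic. The sketch below takes the two uW/MHz figures quoted in the thread and applies the 3 to 3.5 improvement factor to the Cortex-M3 number. It assumes the factor describes the gain when moving from 180nm to 130nm, and that a figure derived from the ARM946E-S carries over to the Cortex-M3. Both are assumptions; the post does not state either explicitly.

```python
CORTEX_M3_180NM = 90.0   # uW/MHz, as quoted in the thread
ARM996HS_130NM = 45.0    # uW/MHz, as quoted in the thread


def scale_to_130nm(uw_per_mhz_180nm, improvement):
    """Estimate a 130nm figure from a 180nm one (assumed scaling factor)."""
    return uw_per_mhz_180nm / improvement


best = scale_to_130nm(CORTEX_M3_180NM, 3.5)
worst = scale_to_130nm(CORTEX_M3_180NM, 3.0)

print(f"Cortex-M3 scaled to 130nm: {best:.1f} to {worst:.1f} uW/MHz")
print(f"ARM996HS at 130nm:         {ARM996HS_130NM:.1f} uW/MHz")
```

Under those assumptions even the pessimistic end of the range lands below the 45uW/MHz figure, which is the "would still win" claim.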
Wilco Dijkstra wrote:

 >>> Additionally a 32-bit CPU can do a lot more work per cycle, so they
 >>> run at a lower frequency or sleep for longer. So a higher performance
 >>> CPU that uses more power may actually use less *energy* to do a
 >>> specific task.
 >>
 >>
 >> Perhaps if we compare a raw core, with a raw core ?
 >>
 >> So, let's see how the Cortex compares, with the new ARM Async Core ?
 >>
 >> Cortex M3                                         = appx 90uW/MHz
 >> ARM996HS [New Clockless, Async technology core ]  = 45uW/MHz
 >>
 >> [These numbers come from the same company, so should be free of
 >> inter-company-skew effects.... ?]
 >
 >
 >
 > No. You forgot to take into account the process geometry.


..but no more than your Tiny2313 <-> Cortex comparison

 > The Cortex-M3 number is for 180nm, the 996HS for 130nm. According to
 > datapoints for the similar ARM946E-S, power consumption improves by
 > a factor of 3 to 3.5 on a 180nm process.


mA/MHz can improve, but the Static Icc effects are starting to bite at
those geometries, so often the focus has to shift from scaled speed, to
clawing back some of the precious lost static uA...

 > So Cortex-M3 would still win by a good
 > margin. Maybe we will get a Cortex-M3HS too?


Yes, a Cortex-M3HS would be an interesting device.
Especially with the right Flash speed, and peripheral mix..

( tho it might confuse the market, with two M3 variants... )

-jg

Wilco Dijkstra wrote:
>> Don't think any one plans to put a Cortex-A8 in a smart card
>> which is one very obvious application for the AVR32...
>
> I don't think anyone sane is going to put the AVR32 there. As I said,
> it is an ARM11 class core, so totally unsuitable for smartcards
> (it doesn't even have a rotate instruction which is essential for
> cryptography). Maybe there will be a smaller low power version
> eventually but that wasn't mentioned.
You have ARM7s in the current high end smartcards.
I believe the AVR32 has the Java functionality precisely for this application.

--
Best Regards,
Ulf Samuelsson
ulf@a-t-m-e-l.com
This message is intended to be my own personal view and it
may or may not be shared by my employer Atmel Nordic AB
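
For readers wondering why a missing rotate instruction matters for cryptography: many hash and cipher rounds rotate 32-bit words, and without a native rotate each one has to be built from two shifts and an OR. A minimal sketch of that emulation follows; the function name `rotl32` is just an illustrative choice.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits using two shifts and an OR."""
    x &= MASK32
    n %= 32
    # Bits shifted out of the top come back in at the bottom.
    return ((x << n) | (x >> (32 - n))) & MASK32


print(hex(rotl32(0x12345678, 8)))
```

On a core with a rotate instruction this is a single operation; emulated, it costs roughly three, repeated in every round of the algorithm.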
Jim Granville wrote:
> Wilco Dijkstra wrote:
>> A quick scan of the AVR8 CPUs revealed that the ATtiny2313/V
>> seems to be the lowest power AVR at 0.41 mW/MHz.
>> Cortex-M3 uses 3.5 times less power...
>
> That is the classic error of comparing a new (future) core-only
> figure, with an existing full-system product. A grapes to pomegranate
> comparison.
Well stated!
> -jg

--
Best Regards,
Ulf Samuelsson
ulf@a-t-m-e-l.com
This message is intended to be my own personal view and it
may or may not be shared by my employer Atmel Nordic AB
"Jim Granville" <no.spam@designtools.co.nz> wrote in message 
news:43f7d549@clear.net.nz...
> Wilco Dijkstra wrote:
>
> > No. You forgot to take into account the process geometry.
>
> ..but no more than your Tiny2313 <-> Cortex comparison
The datasheets didn't give the process, however this page gives some hints:
http://www.atmel.com/dyn/products/ip_param_table.asp?family_id=615
All 180nm libraries use 1.8V like the Tiny2313, so it is likely 180nm.
> > The Cortex-M3 number is for 180nm, the 996HS for 130nm. According to
> > datapoints for the similar ARM946E-S, power consumption improves by
> > a factor of 3 to 3.5 on a 180nm process.
>
> mA/MHz can improve, but the Static Icc effects are starting to bite at
> those geometries, so often the focus has to shift from scaled speed, to
> clawing back some of the precious lost static uA...
180nm isn't nearly as bad as 90nm... But it matters mostly when sleeping, which is why there are various sleep states that power down large parts of the chip (at the cost of slower wakeup). Voltage scaling may be affected too: it is better to run at a slightly higher frequency than to run at a lower voltage/frequency for longer (and thus use more static current).
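
The trade-off between voltage scaling and static current can be sketched with a simple model: dynamic energy per task is C * V^2 * cycles and does not depend on frequency, while static energy is leakage current times voltage times run time, so it grows when the task runs slower. All numbers below are hypothetical. The model also assumes the leakage current is the same at both voltages and that the chip powers down with negligible leakage once the task finishes, both of which are simplifications.

```python
def task_energy_uj(c_eff_pf, volts, freq_mhz, cycles, leak_ua):
    """Energy in microjoules for one task under a simple CMOS power model."""
    # Dynamic part: C * V^2 per cycle, independent of clock frequency.
    dynamic_uj = c_eff_pf * 1e-12 * volts ** 2 * cycles * 1e6
    # Static part: leakage flows for as long as the task is running.
    run_time_s = cycles / (freq_mhz * 1e6)
    static_uj = leak_ua * volts * run_time_s   # uA * V * s = uJ
    return dynamic_uj + static_uj


CYCLES = 1_000_000
for leak_ua in (100.0, 2000.0):   # hypothetical low and high leakage
    fast = task_energy_uj(50.0, 1.8, 20.0, CYCLES, leak_ua)
    slow = task_energy_uj(50.0, 1.5, 10.0, CYCLES, leak_ua)
    print(f"leak {leak_ua:6.0f} uA: fast {fast:.1f} uJ, slow {slow:.1f} uJ")
```

With low leakage the lower-voltage, slower setting wins; once leakage is large enough the ordering flips and finishing quickly becomes the cheaper option.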
> > So Cortex-M3 would still win by a good
> > margin. Maybe we will get a Cortex-M3HS too?
>
> Yes, a Cortex-M3HS would be an interesting device.
> Especially with the right Flash speed, and peripheral mix..
>
> ( tho it might confuse the market, with two M3 variants... )
True, it might be possible to take advantage of asynchronous logic, such as optimizing for the average rather than worst case (e.g. use ripple carry adders instead of lookahead). This would allow for even smaller sizes without a large performance penalty.

Wilco
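
The average-versus-worst-case point about ripple carry adders can be illustrated with a small simulation. In a 32-bit ripple adder the worst case is a carry rippling through all 32 stages, but for random operands the longest carry chain is usually much shorter (the classic result is that it is on the order of log2 of the word length). The sketch below measures that; the chain definition and sample count are illustrative choices, and random operands are an assumption that real workloads may not match.

```python
import random


def longest_carry_chain(a, b, bits=32):
    """Longest number of adder stages any single carry ripples through."""
    longest = 0
    current = 0   # length of the chain currently rippling, 0 if none
    for i in range(bits):
        x = (a >> i) & 1
        y = (b >> i) & 1
        if x & y:
            current = 1                              # generate: new carry starts
        elif x ^ y:
            current = current + 1 if current else 0  # propagate an existing carry
        else:
            current = 0                              # kill: carry is absorbed
        longest = max(longest, current)
    return longest


rng = random.Random(0)   # fixed seed so the run is repeatable
samples = 2000
total = sum(
    longest_carry_chain(rng.getrandbits(32), rng.getrandbits(32))
    for _ in range(samples)
)
average = total / samples

print(f"worst case:      {longest_carry_chain(1, 0xFFFFFFFF)} stages")
print(f"average (random): {average:.2f} stages")
```

A clocked design has to budget for the worst case on every add, while a self-timed adder only pays for the chain that actually occurs.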
Wilco Dijkstra wrote:
> "Jim Granville" <no.spam@designtools.co.nz> wrote in message
> news:43f7d549@clear.net.nz...
>
>> Wilco Dijkstra wrote:
>>
>>> No. You forgot to take into account the process geometry.
>>
>> ..but no more than your Tiny2313 <-> Cortex comparison
>
> The datasheets didn't give the process, however this page gives some
> hints: http://www.atmel.com/dyn/products/ip_param_table.asp?family_id=615
> All 180nm libraries use 1.8V like the Tiny2313, so it is likely 180nm.
The Tiny2313 is a 1.8-5.5V process, so I very much doubt it is 180nm,
more likely 0.35um.

Ulf will know ? :)

-jg
In article <dt6s9g$ao4$1@nntp.aioe.org>, Ulf Samuelsson <ulf@a-t-m-e-l.com> writes
>Wilco Dijkstra wrote:
>> "Ulf Samuelsson" <ulf@a-t-m-e-l.com> wrote in message
>> news:dt5q6v$g3$1@nntp.aioe.org...
>>> D. wrote:
>>> T.I has licensed the Cortex-A8, but I think this is for Nokia and
>>> alike... The AVR32 seems to run at higher frequency and has an MMU
>>> so they may be focusing on different markets.
>>
>> Yes, the AVR32 is definitely not in the same market as the M3. It is
>> an ARM11 + Jazelle + Thumb-2 clone, but because it is late (MIPS did
>> it a few years ago), it now will have to compete with Cortex-A8. Ouch...
>
>Don't think any one plans to put a Cortex-A8 in a smart card
Given the nature of the smart card business (paranoid) you are hardly likely to know yet..

Besides I think the Cortex is aimed at a different market.

--
Chris Hills, Staffs, England
chris@phaedsys.org   www.phaedsys.org
"Jim Granville" <no.spam@designtools.co.nz> wrote in message 
news:43f79040$1@clear.net.nz...
> Wilco Dijkstra wrote:
>> Interestingly it turns out Atmel's marketing department has been
>> working overtime - their benchmarking figures are obviously bogus.
>>
>> They chose to compare against the i.MX21/i.MX31 numbers using
>> GCC (not the fastest compiler around by a large margin) and present
>> them as official ARM926 and ARM1136 numbers. For the codesize
>> results they chose EEMBC figures, however the EEMBC codesize
>> figures optimized for performance are totally meaningless.
>
> Are you surprised ?
Yes, because it's quite brazen and they don't even try to hide it. Do they really think anyone would take them seriously with such wild claims? The 3x speedup over ARM is complete nonsense. The benchmarks in question are floating point, and it is hardly surprising a CPU with an FPU outperforms one that uses emulation. With an FPU the ARM part becomes 4x faster. Where is that revolutionary performance lead now? The documents don't mention floating point anywhere (no FP instructions either, only mention of an optional FPU in the user guide), so it looks like they are misleading on purpose.
> All marketing departments are desperate to make their offerings
> look good, so they choose their leading-edge, against the others
> trailing edge, and then are selective as well.
They must be very desperate then :-)
> I use a general nudge factor of 2:1 in filtering market droid fluff.
>
> If they cannot claim a difference of more than one generation in
> performance, then it is not revolutionary, and probably merely
> comparable with the 'other guys' next release anyway...
Almost everything has been done before, so there is not much chance of skipping a generation. You would need to build a 1+ GHz 2-way out-of-order chip, and compete head on with PowerPC / x86 Geode. Even the Itanium didn't turn out to be much of a revolution...

Wilco
"Wilco Dijkstra" <Wilco_dot_Dijkstra@ntlworld.com> writes:

> Since Thumb-2 is a reencoding of ARM instructions, any existing ARM
> assembler can be assembled to Thumb-2 with minimal effort.
Almost, but not quite. As well as a number of additions there are a few omissions.

--
Jim Garside