Compare ARM MCU Vendors| page 3

Reply by D Yuniskis ● September 16, 2010

Hi Dave,

Dave Graffio wrote:
> How do you compare ARM MCU manufacturers for a project in the USA?

Like you would any other vendor!  See who has what you want/need.
How much they want for it.  What their reputation is.  etc.

Then, see who *else* has "something that you can *tweak*" to do
the same job -- possibly better/worse -- and repeat the process.

Finally, make a "value judgement" on all of the candidates that
fall through the above process.

> I see Atmel, St micro, nxp, Texas instr, Freescale, Marvell - are they all selling the 
> same stuff or is there any real difference? I see St has faster parts but Atmel has more 
> of them. Is price and support all the same?

I don't think you will find "the same part" from any two vendors.
The ARM world is like the "stereo" (HiFi) business of ages past
(modern parallel would be multimedia):  you bought a turntable
from vendor A, the *stylus* for that turntable from vendor B,
the (phono) preamp from vendor C, amplifier from vendor D,
speakers from vendor E, etc.  Until you got the "system" that fit
your price/performance/ego.

With ARM, each vendor *packages* various "components" (referencing
the above analogy) into an MCU.  So, the processing power of the
"core", amount of memory (and flavors thereof) included/supported,
other peripherals onboard, etc. varies.

In theory, you can find The Ideal MCU for your application -- but,
chances are, it is only sold by *one* vendor (though the various
components inside it may appear in a smattering of offerings from
other vendors... though not in the exact same configuration).

> Google doesn't seem to show any information anywhere on this, which is really shocking.

I dunno... google doesn't tell me which *car* is "right" for me,
either!  Amazing!

> I am wondering if I should move between them or standardize on one company.

That's a value judgement.  Do you want to establish a relationship with
*one* company?  (there are pros and cons, of course)  Do you want to
tailor your solutions to your problems (or pick the closest fit from
the offerings of that *one* company)?

That's why they call it "Engineering" instead of "shopping for shoes"...

--don

Reply by rickman ● September 17, 2010

On Sep 16, 11:08 am, D Yuniskis <not.going.to...@seen.com> wrote:
> Hi Dave,
>
> Dave Graffio wrote:
> > Google doesn't seem to show any information anywhere on this, which is really shocking.
>
> I dunno... google doesn't tell me which *car* is "right" for me,
> either!  Amazing!

Google is just a way to search for info that others provide.  There
are some comparisons of ARM devices; at one time I put up a comparison
of ARM7 devices myself.  But it is very hard to keep updated.  Now
ARM7 is on the down slope and the Cortex architectures are the hot,
new thing.  Good luck trying to keep up with all the new product
introductions there!  I have a lot of things on my plate before I
could do this, but I may take a stab at another comparison chart for
ARM CM devices over the winter.  The last one was done when I needed
the info for myself and I may need to evaluate ARM cores again soon.

> > I am wondering if I should move between them or standardize on one company.
>
> That's a value judgement.  Do you want to establish a relationship with
> *one* company?  (there are pros and cons, of course)  Do you want to
> tailor your solutions to your problems (or pick the closest fit from
> the offerings of that *one* company)?

The reasons for standardizing on one company in the (distant) past had
to do with the differences in CPU architectures.  Once you learned the
PIC12 devices you didn't want to start over with the MSP430 parts and
relearn both the CPU and the tools.  Now that ARM has
provided a more complete MCU core in the CM3/1/0, the CPUs are nearly
all the same, which eliminates this issue.  But the peripherals are very
different between brands.  The tools have a lot more support for the
peripherals.  So there is still reason for developing a brand
loyalty.  At some point I expect the tool vendors may have reason to
help mitigate this issue and it will be much easier to port between
brands.  But that may be a long time off, if ever.

If you think you will have many designs that need a variety of MCU
parts with different capabilities, I would suggest that you consider
the major players with broad product lines.  This can save cost on
tools as well as relearning how to use the peripherals.  It should
cost you little in terms of recurring part costs or a match to your
application.  In other words, don't switch vendors unless you have a
reason.

Rick

Reply by Ulf Samuelsson ● September 20, 2010

David Brown skrev:
> On 13/09/2010 17:05, Ulf Samuelsson wrote:
>> linnix skrev:
>>>> As for speed, most current flash based ARMs/CM3s are limited by
>>>> flash waitstates and there is very little performance increase
>>>> once you reach those 60-70 MHz.
>>>>
>>>> While a CM3 at zero waitstates is 1.25 Dhrystone MIPS/MHz,
>>>> the performance at 100 MHz with 4 waitstates is more like
>>>> 0,85 Dhrystone MIPS/MHz.
>>>>
>>>> Using a Keil compiler, I measured the difference between
>>>> running at 84 MHz and 96 Mhz to be less than 1%.
>>>> This was more than one year ago, but I doubt that this will change
>>>> until people start to put faster flash memories into the products.
>>>>
>>>
>>> Or putting instruction cache there.
>>
>> That might work but I am not aware of this solution beeing implemented
>> anywhere.
>>
>> It was not neccessarily a good idea with the ARM7.
>> If you added an insruction cache you
>> added 1 waitstate to all accesses.
>>
>> Good for top performance on some apps, but certainly
>> reduced the worst case performance, which sometimes
>> is more important.
>>
> 
> A better solution for micros like that is a wider flash design with an 
> sram buffer in the flash module - that is certainly how some 
> manufacturers handle the problem.  It is a simpler solution than a full 
> instruction cache because you have only a single "tag" (or perhaps two, 
> if you have two such buffers), and there are no issues with coherence or 
> anything else.  The buffer of perhaps 256 bytes gets filled whenever you 
> access a new "page" in the flash, so that the processor then reads from 
> the buffer rather than directly from the flash.  And if space/economics 
> allow, you have have a wider flash-to-buffer bus to keep up a high 
> bandwidth even with slow flash and a fast processor.
> 

The disadvantage of having a 256 byte wide memory is power consumption.
You will have 2048 active sense amplifiers.
I don't see that coming soon.

-- 
Best Regards
Ulf Samuelsson
These are my own personal opinions, which may
or may not be shared by my employer Atmel Nordic AB

Reply by David Brown ● September 20, 2010

On 20/09/2010 10:30, Ulf Samuelsson wrote:
> David Brown skrev:
>> On 13/09/2010 17:05, Ulf Samuelsson wrote:
>>> linnix skrev:
>>>>> As for speed, most current flash based ARMs/CM3s are limited by
>>>>> flash waitstates and there is very little performance increase
>>>>> once you reach those 60-70 MHz.
>>>>>
>>>>> While a CM3 at zero waitstates is 1.25 Dhrystone MIPS/MHz,
>>>>> the performance at 100 MHz with 4 waitstates is more like
>>>>> 0,85 Dhrystone MIPS/MHz.
>>>>>
>>>>> Using a Keil compiler, I measured the difference between
>>>>> running at 84 MHz and 96 Mhz to be less than 1%.
>>>>> This was more than one year ago, but I doubt that this will change
>>>>> until people start to put faster flash memories into the products.
>>>>>
>>>>
>>>> Or putting instruction cache there.
>>>
>>> That might work but I am not aware of this solution beeing implemented
>>> anywhere.
>>>
>>> It was not neccessarily a good idea with the ARM7.
>>> If you added an insruction cache you
>>> added 1 waitstate to all accesses.
>>>
>>> Good for top performance on some apps, but certainly
>>> reduced the worst case performance, which sometimes
>>> is more important.
>>>
>>
>> A better solution for micros like that is a wider flash design with an
>> sram buffer in the flash module - that is certainly how some
>> manufacturers handle the problem. It is a simpler solution than a full
>> instruction cache because you have only a single "tag" (or perhaps
>> two, if you have two such buffers), and there are no issues with
>> coherence or anything else. The buffer of perhaps 256 bytes gets
>> filled whenever you access a new "page" in the flash, so that the
>> processor then reads from the buffer rather than directly from the
>> flash. And if space/economics allow, you have have a wider
>> flash-to-buffer bus to keep up a high bandwidth even with slow flash
>> and a fast processor.
>>
>
> The disadvantage of having a 256 byte wide memory, is power consumption.
> You will have 2048 active sense amplifiers.
> I dont see that coming soon.
>

You don't need 256 byte wide memory - you need a 256 byte sram buffer on 
the flash.  If we assume that the processor ideally wants to read 32-bit 
wide data from the flash at 100 MHz, and the flash itself is capable of 
providing data once per cycle at 50 MHz (perhaps with a couple of cycles 
delay for initial access to a page), then the flash-to-buffer width 
should be 64 bits.  Then there is a brief stall when accessing a new 
page, but otherwise the processor gets its instructions at full speed.
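
For anyone who wants to play with the numbers, here is a rough Python sketch of the bandwidth-matching arithmetic behind that 64-bit figure. The function name and its parameters are made up for illustration, and it only models sustained bandwidth (it ignores the initial page-access delay mentioned above).

```python
def required_flash_width_bits(cpu_bus_bits, cpu_mhz, flash_mhz):
    """Narrowest power-of-two flash-to-buffer width (in bits) whose
    sustained bandwidth keeps up with the processor's fetch bandwidth.

    Hypothetical helper: it assumes the flash delivers one word per
    flash cycle and ignores any page-open latency.
    """
    # Bits the flash must deliver per access to match the CPU's fetch rate
    needed = cpu_bus_bits * cpu_mhz / flash_mhz
    width = 1
    while width < needed:
        width *= 2
    return width


# The case from the post: 32-bit fetches at 100 MHz, flash good for 50 MHz
print(required_flash_width_bits(32, 100, 50))
```

With the same assumptions, a 150 MHz core on 50 MHz flash would need 96 bits per access, which rounds up to a 128-bit path.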

Yes, those 64 bits mean 64 sense amplifiers, compared to 16 amplifiers 
that might be used on a slower flash setup.  But apart from a small 
leakage current, the amplifiers only take power when they are used, so 
the number of amplifiers doesn't affect the power much - the total power 
is proportional to the bits read from the flash.  With a buffer 
arrangement, you'll get some unnecessary reads to fill the buffer, but 
you'll avoid duplicate reads on many loops - my guess is you'd reduce 
the total number of reads.

Reply by Dave Nadler ● September 20, 2010

On Sep 20, 5:10 am, David Brown <da...@westcontrol.removethisbit.com>
wrote:
> On 20/09/2010 10:30, Ulf Samuelsson wrote:
>
>
>
> > David Brown skrev:
> >> On 13/09/2010 17:05, Ulf Samuelsson wrote:
> >>> linnix skrev:
> >>>>> As for speed, most current flash based ARMs/CM3s are limited by
> >>>>> flash waitstates and there is very little performance increase
> >>>>> once you reach those 60-70 MHz.
>
> >>>>> While a CM3 at zero waitstates is 1.25 Dhrystone MIPS/MHz,
> >>>>> the performance at 100 MHz with 4 waitstates is more like
> >>>>> 0,85 Dhrystone MIPS/MHz.
>
> >>>>> Using a Keil compiler, I measured the difference between
> >>>>> running at 84 MHz and 96 Mhz to be less than 1%.
> >>>>> This was more than one year ago, but I doubt that this will change
> >>>>> until people start to put faster flash memories into the products.
>
> >>>> Or putting instruction cache there.
>
> >>> That might work but I am not aware of this solution beeing implemented
> >>> anywhere.
>
> >>> It was not neccessarily a good idea with the ARM7.
> >>> If you added an insruction cache you
> >>> added 1 waitstate to all accesses.
>
> >>> Good for top performance on some apps, but certainly
> >>> reduced the worst case performance, which sometimes
> >>> is more important.
>
> >> A better solution for micros like that is a wider flash design with an
> >> sram buffer in the flash module - that is certainly how some
> >> manufacturers handle the problem. It is a simpler solution than a full
> >> instruction cache because you have only a single "tag" (or perhaps
> >> two, if you have two such buffers), and there are no issues with
> >> coherence or anything else. The buffer of perhaps 256 bytes gets
> >> filled whenever you access a new "page" in the flash, so that the
> >> processor then reads from the buffer rather than directly from the
> >> flash. And if space/economics allow, you have have a wider
> >> flash-to-buffer bus to keep up a high bandwidth even with slow flash
> >> and a fast processor.
>
> > The disadvantage of having a 256 byte wide memory, is power consumption.
> > You will have 2048 active sense amplifiers.
> > I dont see that coming soon.
>
> You don't need 256 byte wide memory - you need a 256 byte sram buffer on
> the flash.  If we assume that the processor ideally wants to read 32-bit
> wide data from the flash at 100 MHz, and the flash itself is capable of
> providing data once per cycle at 50 MHz (perhaps with a couple of cycles
> delay for initial access to a page), then the flash-to-buffer width
> should be 64 bits.  Then there is a brief stall when accessing a new
> page, but otherwise the processor gets its instructions at full speed.
>
> Yes, those 64 bits means 64 sense amplifiers, compared to 16 amplifiers
> that might be used on a slower flash setup.  But apart from a small
> leakage current, the amplifiers only take power when they are used, so
> the number of amplifiers doesn't affect the power much - the total power
> is proportional to the bits read from the flash.  With a buffer
> arrangement, you'll get some unnecessary reads to fill the buffer, but
> you'll avoid duplicate reads on many loops - my guess is you'd reduce
> the total number of reads.

LPC1800... can operate at 150MHz straight from its 1Mbyte flash
memory, or from RAM... The flexible dual-bank 256bit wide flash
memories...

The dual-bank arrangement seems not to be for performance: the banks
aren't interleaved, so you don't get the benefit of a 512-bit width.

See:
http://www.electronicsweekly.com/Articles/2010/09/20/49475/NXP-reveals-150MHz-ARM-Cortex-M3.htm

Interesting trade-off!
Best Regards, Dave

Reply by Ulf Samuelsson ● September 20, 2010

2010-09-20 21:04, Dave Nadler skrev:
> On Sep 20, 5:10 am, David Brown<da...@westcontrol.removethisbit.com>
> wrote:
>> On 20/09/2010 10:30, Ulf Samuelsson wrote:
>>
>>
>>
>>> David Brown skrev:
>>>> On 13/09/2010 17:05, Ulf Samuelsson wrote:
>>>>> linnix skrev:
>>>>>>> As for speed, most current flash based ARMs/CM3s are limited by
>>>>>>> flash waitstates and there is very little performance increase
>>>>>>> once you reach those 60-70 MHz.
>>
>>>>>>> While a CM3 at zero waitstates is 1.25 Dhrystone MIPS/MHz,
>>>>>>> the performance at 100 MHz with 4 waitstates is more like
>>>>>>> 0,85 Dhrystone MIPS/MHz.
>>
>>>>>>> Using a Keil compiler, I measured the difference between
>>>>>>> running at 84 MHz and 96 Mhz to be less than 1%.
>>>>>>> This was more than one year ago, but I doubt that this will change
>>>>>>> until people start to put faster flash memories into the products.
>>
>>>>>> Or putting instruction cache there.
>>
>>>>> That might work but I am not aware of this solution beeing implemented
>>>>> anywhere.
>>
>>>>> It was not neccessarily a good idea with the ARM7.
>>>>> If you added an insruction cache you
>>>>> added 1 waitstate to all accesses.
>>
>>>>> Good for top performance on some apps, but certainly
>>>>> reduced the worst case performance, which sometimes
>>>>> is more important.
>>
>>>> A better solution for micros like that is a wider flash design with an
>>>> sram buffer in the flash module - that is certainly how some
>>>> manufacturers handle the problem. It is a simpler solution than a full
>>>> instruction cache because you have only a single "tag" (or perhaps
>>>> two, if you have two such buffers), and there are no issues with
>>>> coherence or anything else. The buffer of perhaps 256 bytes gets
>>>> filled whenever you access a new "page" in the flash, so that the
>>>> processor then reads from the buffer rather than directly from the
>>>> flash. And if space/economics allow, you have have a wider
>>>> flash-to-buffer bus to keep up a high bandwidth even with slow flash
>>>> and a fast processor.
>>
>>> The disadvantage of having a 256 byte wide memory, is power consumption.
>>> You will have 2048 active sense amplifiers.
>>> I dont see that coming soon.
>>
>> You don't need 256 byte wide memory - you need a 256 byte sram buffer on
>> the flash.  If we assume that the processor ideally wants to read 32-bit
>> wide data from the flash at 100 MHz, and the flash itself is capable of
>> providing data once per cycle at 50 MHz (perhaps with a couple of cycles
>> delay for initial access to a page), then the flash-to-buffer width
>> should be 64 bits.  Then there is a brief stall when accessing a new
>> page, but otherwise the processor gets its instructions at full speed.
>>
>> Yes, those 64 bits means 64 sense amplifiers, compared to 16 amplifiers
>> that might be used on a slower flash setup.  But apart from a small
>> leakage current, the amplifiers only take power when they are used, so
>> the number of amplifiers doesn't affect the power much - the total power
>> is proportional to the bits read from the flash.  With a buffer
>> arrangement, you'll get some unnecessary reads to fill the buffer, but
>> you'll avoid duplicate reads on many loops - my guess is you'd reduce
>> the total number of reads.
>
> LPC1800... can operate at 150MHz straight from its 1Mbyte flash
> memory, or from RAM... The flexible dual-bank 256bit wide flash
> memories...
>
> Dual-bank seems to be not for performance - doesn't get the benefit of
> 512bit width as they aren't interleaved.
>
> See:
> http://www.electronicsweekly.com/Articles/2010/09/20/49475/NXP-reveals-150MHz-ARM-Cortex-M3.htm
>
> Interesting trade-off !
> Best Regards, Dave

In practice you see that the LPC2xxx with its 128-bit flash draws a lot
more current than the SAM7 with its 32-bit flash.
In Thumb mode, the SAM7 is faster than the LPC (at the same clock
frequency) due to the faster flash.
The wide flash memories will give you some extra boost at the top
performance level. The programmable nature of the SAM3
allowed me to test the difference between 64 & 128 bit,
and it is ~5%. Normally it is better to increase the clock
than it is to increase the width of the flash.
Same performance, but less power.

-- 
Best Regards
Ulf Samuelsson
These are my own personal opinions, which may (or may not)
be shared by my employer Atmel Nordic AB

Reply by rickman ● September 21, 2010

On Sep 3, 12:08 am, An Schwob in the USA <schwo...@aol.com> wrote:
> On Sep 2, 7:37 pm, "Dave Graffio" <wscra...@yahoo.com> wrote:
>
> > "antedeluvian" wrote...
> > >A really great part is the Cypress PSOC5 which gives a great deal of
> > > flexibilty because of its configurabilty.
>
> > > Unfortunately it appears to be made of pure unobtanium.
>
> > Not true. I've heard it's being designed by the engineering firm of Tuttle and Dunsel.
> > (Capt, Retired)
>
> Dave,
>
> you heard strange things such as Luminary (TI) being low quality. They
> manufacture on one of the highest quality production lines in the
> world TSMC. Marvell does not design and manufacture MCUs, they do high
> end application processors, no flash but lots of MHz. Atmel started
> strong with ARM7 and ARM9 but is weak in Cortex-M3, their focus
> shifted very much towards AVR32. NXP offers the fastest Cortex-M3 with
> Flash, btw. did you know that Toshiba has the fastest M3 running from
> internal SDRAM? Did you know that Energy Micro achieve better power
> numbers using the Cortex-M3 then any other vendor even those using
> Cortex-M0?

Not sure what you mean by "Atmel is weak in Cortex-M3".  The CM3 is
new enough that not everyone has their products out yet.  I think
Atmel dilly dallied too long with the CM3, but I expect this was due
to company goal issues and not because of "weakness" of any kind.
They have a competing 32 bit MCU product and I expect they could only
throw so many resources at bringing out a totally new MCU line.  Give
them a few more months and I think they will not disappoint.

"Fastest" is always a short lived title.  Clock speed is seldom a
determining criterion in selecting an MCU and I expect it is often
given too much weight by engineers when initially winnowing their MCU
choices.  It is a simple number that is easy to verify.  CPU speed is
a much more complex measurement that is very hard to verify for your
application, but this is the one that may actually make a difference
in your design.
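
To put rough numbers on that, here is a small Python sketch using the Dhrystone figures Ulf gave elsewhere in this thread (1.25 Dhrystone MIPS/MHz at zero waitstates, about 0.85 at 100 MHz with 4 waitstates). The helper name is made up, and Dhrystone is only a stand-in here; your own application may scale quite differently.

```python
def dhrystone_mips(clock_mhz, dmips_per_mhz):
    """Total Dhrystone MIPS for a given clock and per-MHz efficiency."""
    return clock_mhz * dmips_per_mhz


# Figures quoted in this thread for a CM3 running from flash
ZERO_WS_DMIPS_PER_MHZ = 1.25   # zero waitstates
FOUR_WS_DMIPS_PER_MHZ = 0.85   # 100 MHz with 4 waitstates

at_100_mhz = dhrystone_mips(100, FOUR_WS_DMIPS_PER_MHZ)

# Clock a zero-waitstate part would need to deliver the same throughput
equivalent_mhz = at_100_mhz / ZERO_WS_DMIPS_PER_MHZ

print(f"100 MHz with 4 waitstates: {at_100_mhz:.0f} DMIPS")
print(f"Same throughput at zero waitstates: {equivalent_mhz:.0f} MHz")
```

The point is only that the headline clock and the delivered throughput are different numbers.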

> PSoC5 is a great product and if your volume production does not start
> before 2011, you might want to order a FirstTouch for PSoC 5, just $49
> free tools, several sensors for acceleration, temperature, capacitive
> touch and readily available. Got one on my desk, like it.
> http://www.cypress.com/psoc5 is a good place to start.

How can you plan to use a part, even if you can wait six months for
production, if you don't know the price?  Has anyone heard a number
for production pricing on the PSOC5?

> I could write a lot more about ARM / Cortex-MCUs because that's what I
> have been dealing with since the first ARM7 MCUs hit the market. If
> you need professional help, with the selection write an email to
> microcontroller (skip this at gmail) -dod comm
> It would go a long way if you would list your requirements, you get
> better answers.
>
> For a list with many articles about Cortex based MCUs check out this
> one: http://mcu-related.com/architectures/35-cortex-m3

Some three or four years ago I put together a list of ARM7 devices
available.  By the time Luminary came on the scene it got to be too
much work to update.  Now with all the CMx devices out there it would
be a major effort to keep this updated.  Does anyone have a
comprehensive comparison of features and capabilities of the CMx MCUs
available?

Rick

Reply by rickman ● September 21, 2010

On Sep 20, 4:30 am, Ulf Samuelsson <u...@a-t-m-e-l.com> wrote:
> David Brown skrev:
> > A better solution for micros like that is a wider flash design with an
> > sram buffer in the flash module - that is certainly how some
> > manufacturers handle the problem.  It is a simpler solution than a full
> > instruction cache because you have only a single "tag" (or perhaps two,
> > if you have two such buffers), and there are no issues with coherence or
> > anything else.  The buffer of perhaps 256 bytes gets filled whenever you
> > access a new "page" in the flash, so that the processor then reads from
> > the buffer rather than directly from the flash.  And if space/economics
> > allow, you have have a wider flash-to-buffer bus to keep up a high
> > bandwidth even with slow flash and a fast processor.
>
> The disadvantage of having a 256 byte wide memory, is power consumption.
> You will have 2048 active sense amplifiers.
> I dont see that coming soon.

I hope you aren't involved in architecting new MCU designs.  I don't
think anyone said they wanted 2048 sense amplifiers.  I would either
interpret the above to be "256 bits" or I would consider an
implementation that used a 256 byte cache of some sort.  What would be
the utility of a 256 byte wide interface to the Flash?  Even the
fastest CM3 CPUs can't run at nearly that speed.

Rick

Reply by David Brown ● September 21, 2010

On 21/09/2010 13:16, rickman wrote:
> On Sep 20, 4:30 am, Ulf Samuelsson<u...@a-t-m-e-l.com>  wrote:
>> David Brown skrev:
>>> A better solution for micros like that is a wider flash design with an
>>> sram buffer in the flash module - that is certainly how some
>>> manufacturers handle the problem.  It is a simpler solution than a full
>>> instruction cache because you have only a single "tag" (or perhaps two,
>>> if you have two such buffers), and there are no issues with coherence or
>>> anything else.  The buffer of perhaps 256 bytes gets filled whenever you
>>> access a new "page" in the flash, so that the processor then reads from
>>> the buffer rather than directly from the flash.  And if space/economics
>>> allow, you have have a wider flash-to-buffer bus to keep up a high
>>> bandwidth even with slow flash and a fast processor.
>>
>> The disadvantage of having a 256 byte wide memory, is power consumption.
>> You will have 2048 active sense amplifiers.
>> I dont see that coming soon.
>
> I hope you aren't involved in architecting new MCU designs.  I don't
> think anyone said they wanted 2048 sense amplifiers.  I would either
> interpret the above to be "256 bits" or I would consider an
> implementation that used a 256 byte cache of some sort.  What would be
> the utility of a 256 byte wide interface to the Flash?  Even the
> fastest CM3 CPUs can't run at nearly that speed.
>

I was referring to a 256 byte cache, but perhaps I wasn't clear in my 
description.  Such a page cache will be filled from the flash at a speed 
that suits the flash, with a width that matches the flash (perhaps 
something like 64-bit or even 128-bit for performance-optimised parts, 
and maybe as small as 16-bit for price or power optimised parts).  On 
the other side of the cache, the processor will read out with a speed 
and width that matches its instruction bus - typically 32-bit.

It is effectively a specialised type of instruction cache - less 
flexible, but much simpler to implement.
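
As a rough illustration of the single-tag idea, here is a toy Python model. It is only a sketch: the function name, the addresses and the loop shapes are invented for the example, and it counts buffer refills without modelling any timing.

```python
PAGE_BYTES = 256  # buffer size suggested in the description above


def count_page_fills(fetch_addresses, page_bytes=PAGE_BYTES):
    """Count how often a single-tag page buffer has to be refilled
    for a sequence of instruction fetch addresses."""
    current_tag = None  # the one "tag": which flash page is buffered
    fills = 0
    for addr in fetch_addresses:
        tag = addr // page_bytes
        if tag != current_tag:  # fetch falls outside the buffered page
            current_tag = tag
            fills += 1
    return fills


# Hypothetical trace 1: a 16-instruction loop inside one page, run 10 times
loop = [0x1000 + 4 * i for i in range(16)] * 10

# Hypothetical trace 2: the same loop, but every pass also executes
# two instructions of a helper that lives in a different page
loop_with_call = []
for _ in range(10):
    loop_with_call += [0x1000 + 4 * i for i in range(16)] + [0x2000, 0x2004]

print(count_page_fills(loop))            # one fill, then every fetch hits
print(count_page_fills(loop_with_call))  # two fills per pass
```

The second trace shows the "less flexible" part: with only one tag, code that bounces between two pages refills the buffer on every bounce, which is where a second buffer would help.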

I've read about such a cache, but I can't remember which chip used it - 
it may not even have been an ARM device (perhaps it was a ColdFire v2 
microcontroller).  And many parts have some sort of "flash accelerator" 
in their feature list, which is probably a similar idea.

Reply by rickman ● September 21, 2010

On Sep 21, 7:41 am, David Brown <da...@westcontrol.removethisbit.com>
wrote:
> On 21/09/2010 13:16, rickman wrote:
>
>
>
> > On Sep 20, 4:30 am, Ulf Samuelsson<u...@a-t-m-e-l.com>  wrote:
> >> David Brown skrev:
> >>> A better solution for micros like that is a wider flash design with an
> >>> sram buffer in the flash module - that is certainly how some
> >>> manufacturers handle the problem.  It is a simpler solution than a full
> >>> instruction cache because you have only a single "tag" (or perhaps two,
> >>> if you have two such buffers), and there are no issues with coherence or
> >>> anything else.  The buffer of perhaps 256 bytes gets filled whenever you
> >>> access a new "page" in the flash, so that the processor then reads from
> >>> the buffer rather than directly from the flash.  And if space/economics
> >>> allow, you have have a wider flash-to-buffer bus to keep up a high
> >>> bandwidth even with slow flash and a fast processor.
>
> >> The disadvantage of having a 256 byte wide memory, is power consumption.
> >> You will have 2048 active sense amplifiers.
> >> I dont see that coming soon.
>
> > I hope you aren't involved in architecting new MCU designs.  I don't
> > think anyone said they wanted 2048 sense amplifiers.  I would either
> > interpret the above to be "256 bits" or I would consider an
> > implementation that used a 256 byte cache of some sort.  What would be
> > the utility of a 256 byte wide interface to the Flash?  Even the
> > fastest CM3 CPUs can't run at nearly that speed.
>
> I was referring to a 256 byte cache, but perhaps I wasn't clear in my
> description.  Such a page cache will be filled from the flash at a speed
> that suits the flash, with a width that matches the flash (perhaps
> something like 64-bit or even 128-bit for performance-optimised parts,
> and maybe as small as 16-bit for price or power optimised parts).  On
> the other side of the cache, the processor will read out with a speed
> and width that matches its instruction bus - typically 32-bit.
>
> It is effectively a specialised type of instruction cache - less
> flexible, but much simpler to implement.
>
> I've read about such a cache, but I can't remember which chip used it -
> it may not even have been an ARM device (perhaps it was a ColdFire v2
> microcontroller).  And many parts have some sort of "flash accelerator"
> in their feature list, which are probably a similar idea.

Yes, simpler to implement, but definitely less effective.  For
example, let's assume the flash reads out 32 bytes (256 bits) at a rate
of 50 MHz.  That's 1600 MB/s.  It would take 160 ns (8 reads) to fill
the buffer on a jump.  If the destination instruction was in the last
line read, that would be a long stall of the processor.  Of course,
you could make the fill a bit smarter, reading the needed line first,
but if the second instruction word was in the next line the processor
would still have to wait for both reads to complete, rather slow in
that case.
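
Here is the same arithmetic as a small Python sketch, in case anyone wants to try other widths or clock rates. The 256-byte buffer is the size David proposed; the `stall_ns` helper is made up for illustration and assumes a simple fill that always starts at the first line of the page.

```python
LINE_BYTES = 32      # one flash read: 32 bytes (256 bits)
FLASH_MHZ = 50       # flash read rate
BUFFER_BYTES = 256   # page buffer size from David's description

bandwidth_mb_s = LINE_BYTES * FLASH_MHZ      # bytes per read x reads per us
read_time_ns = 1000 / FLASH_MHZ              # time for one flash read
reads_to_fill = BUFFER_BYTES // LINE_BYTES   # reads needed for a full refill
fill_time_ns = reads_to_fill * read_time_ns  # time to refill the whole buffer


def stall_ns(target_line):
    """Wait until the line holding the jump target has arrived,
    assuming the fill starts at line 0 and proceeds in order."""
    return (target_line + 1) * read_time_ns


print(bandwidth_mb_s, "MB/s")
print(fill_time_ns, "ns for a full refill")
print(stall_ns(reads_to_fill - 1), "ns stall if the target is in the last line")
print(stall_ns(0), "ns stall if the target is in the first line")
```

With these numbers a jump into the last line of a page waits for the whole refill, while a jump into the first line waits for a single read.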

So yes, there are tradeoffs and the fact that this sort of cache is
seldom seen makes me think the bottom line is either work with no
cache (meaning a very minimal cache like a single line cache) or
design an associative cache that doesn't need to refill the whole
cache.  There are volumes of material written on cache memory designs
and yet we keep seeing the same basic ones used in practice... for the
most part.

Rick
