Reply by Wim Lewis August 15, 2008
In article <tfc6a499gladgnnb9a4adimofqbbtml6qo@4ax.com>,
Paul Keinanen  <keinanen@sci.fi> wrote:
>Float add/sub are a bit more costly due to the normalization and
>demoralization required,

You know, I'd always suspected that demoralization was a requirement of floating-point support...

--
Wim Lewis <wiml@hhhh.org>, Seattle, WA, USA. PGP keyID 27F772C1
Reply by David Brown August 14, 2008
Michael Chapman wrote:
> tns1 wrote:
>> Paul Keinanen wrote:
>>> On Wed, 13 Aug 2008 11:44:43 -0700, tns1 <tns1@cox.net> wrote:
>> ...
>>> If you already require a 32 bit integer processor, are you sure you
>>> need HW floating point ?
>>>
>>> With 32 bit integer hardware single precision multiplication/division
>>> is quite trivial (unless you need full IEEE compliance :-).
>>>
>> The existing system uses a HW FPU, so its just easier to require this
>> on the new system rather than do the up-front analysis to justify
>> using a SW solution. As long as it could do basic single precision
>> operations and was IEEE754 compliant I suspect it would be OK. Like so
>> many projects I need to architect a general solution before all the
>> details are known.
>
> Any decent 'C' compiler on an embedded processor gives a good emulation
> of single and double precision floating points operations. Most of these
> are IEEE754 compliant (as opposed to many HW implementations which are
> not).
Most decent compilers (and libraries) will give you a choice of compliant behaviour or non-compliant but smaller and faster behaviour. They also (except perhaps for small cpus which are wholly unsuited for complex maths) normally implement both singles and doubles, and sometimes long doubles - hardware implementations are often limited to single precision.
> The only advantage of a HW implementation is speed. If the SW one is
> fast enough on the processor you have selected then there is no need
> whatsoever for the HW.
I think hardware floating point can make sense on specialised devices, such as floating point DSPs. But for general usage, there are few advantages and many disadvantages until you are talking about much larger cpu devices such as x86 or PPC architectures.
Reply by Michael Chapman August 14, 2008
tns1 wrote:
> Paul Keinanen wrote:
>> On Wed, 13 Aug 2008 11:44:43 -0700, tns1 <tns1@cox.net> wrote:
> ...
>> If you already require a 32 bit integer processor, are you sure you
>> need HW floating point ?
>>
>> With 32 bit integer hardware single precision multiplication/division
>> is quite trivial (unless you need full IEEE compliance :-).
>>
> The existing system uses a HW FPU, so its just easier to require this on
> the new system rather than do the up-front analysis to justify using a
> SW solution. As long as it could do basic single precision operations
> and was IEEE754 compliant I suspect it would be OK. Like so many
> projects I need to architect a general solution before all the details
> are known.
Any decent 'C' compiler on an embedded processor gives a good emulation of single and double precision floating point operations. Most of these are IEEE754 compliant (as opposed to many HW implementations, which are not).

The only advantage of a HW implementation is speed. If the SW one is fast enough on the processor you have selected then there is no need whatsoever for the HW.
Reply by David Brown August 14, 2008
tns1 wrote:
> Paul Keinanen wrote:
>> On Wed, 13 Aug 2008 11:44:43 -0700, tns1 <tns1@cox.net> wrote:
> ...
>> If you already require a 32 bit integer processor, are you sure you
>> need HW floating point ?
>>
>> With 32 bit integer hardware single precision multiplication/division
>> is quite trivial (unless you need full IEEE compliance :-).
>>
> The existing system uses a HW FPU, so its just easier to require this on
> the new system rather than do the up-front analysis to justify using a
> SW solution. As long as it could do basic single precision operations
> and was IEEE754 compliant I suspect it would be OK. Like so many
> projects I need to architect a general solution before all the details
> are known.
Do you *really* need IEEE754 compliance, or is that just a buzzword someone has put in without clarification? At its simplest, IEEE754 means using the standard format for single and double float formats, and specifies a required level of accuracy for arithmetic operations. But full compliance requires handling of NaNs, denormalized numbers, rounding modes, signed zeros, and other such features that are very seldom needed - and almost never in embedded systems. Many HW FPU units can't provide full compliance (and they are often limited to single precision), and need software traps and other mechanisms that end up slower than a software-only solution if you really need these features.
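To see which of the features listed above a given compiler, library, or FPU actually delivers, a small probe can help. The following is only a minimal sketch in standard C99; the function names are made up for illustration, and it says nothing about rounding modes or exception flags, which would need separate checks.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* All names below are hypothetical helpers, not part of any library.
   'volatile' keeps the compiler from folding the checks at build time,
   so the target's run-time float support is what gets exercised. */

/* A NaN must compare unequal to everything, including itself. */
bool nan_compares_unequal(void)
{
    volatile float n = NAN;
    return n != n;
}

/* -0.0 must compare equal to +0.0 but keep its sign bit. */
bool has_signed_zero(void)
{
    volatile float z = -0.0f;
    return z == 0.0f && signbit(z) != 0;
}

/* Halving the smallest normal number must give a denormal (subnormal)
   value, not zero. Implementations that flush to zero fail this check. */
bool has_denormals(void)
{
    volatile float tiny = FLT_MIN;
    tiny = tiny / 2.0f;
    return fpclassify(tiny) == FP_SUBNORMAL;
}
```

If all three return true, the implementation handles NaNs, signed zeros and denormals in the basic sense. If `has_denormals()` returns false, the library or FPU is probably flushing small results to zero, which is exactly the kind of "non-compliant but faster" behaviour that is often perfectly acceptable in an embedded system.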
Reply by tns1 August 13, 2008
Paul Keinanen wrote:
> On Wed, 13 Aug 2008 11:44:43 -0700, tns1 <tns1@cox.net> wrote:
...
> If you already require a 32 bit integer processor, are you sure you
> need HW floating point ?
>
> With 32 bit integer hardware single precision multiplication/division
> is quite trivial (unless you need full IEEE compliance :-).
The existing system uses a HW FPU, so it's just easier to require this on the new system rather than do the up-front analysis to justify using a SW solution. As long as it could do basic single precision operations and was IEEE754 compliant I suspect it would be OK. Like so many projects, I need to architect a general solution before all the details are known.
Reply by Paul Keinanen August 13, 2008
On Wed, 13 Aug 2008 11:44:43 -0700, tns1 <tns1@cox.net> wrote:

>40-60MIPS performance, 32bit single core
>HW floating point
If you already require a 32 bit integer processor, are you sure you need HW floating point ?

With 32 bit integer hardware single precision multiplication/division is quite trivial (unless you need full IEEE compliance :-).

Float add/sub are a bit more costly due to the normalization and demoralization required, but if the HW supports multiple bit shifts, in which the shift count can be variable (e.g. specified in a register), the 32 bit integer processor can handle floating point quite effectively.

On the other hand, I would very much prefer a float/double FPU, if the main CPU is only 8/16 bits.

Paul
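For illustration, here is a rough sketch of the kind of integer-only routines the post above describes. It is not taken from any particular compiler's support library. The function names are invented, the code assumes IEEE 754 single precision bit layout and a 32x32 to 64 bit multiply, and it deliberately skips the "full IEEE compliance" parts: NaN and infinity inputs are not handled, denormals are treated as zero, and results are truncated rather than rounded. The add routine only covers two positive normal operands.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: move between a float and its bit pattern. */
static uint32_t f2u(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float u2f(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Multiply: XOR the signs, add the exponents, multiply the mantissas. */
uint32_t soft_fmul_bits(uint32_t a, uint32_t b)
{
    uint32_t sign = (a ^ b) & 0x80000000u;
    int32_t ea = (int32_t)((a >> 23) & 0xFFu);
    int32_t eb = (int32_t)((b >> 23) & 0xFFu);
    if (ea == 0 || eb == 0)
        return sign;                     /* zero, or denormal treated as zero */

    /* 24-bit mantissas with the implicit leading 1 restored. */
    uint64_t prod = (uint64_t)((a & 0x007FFFFFu) | 0x00800000u) *
                    ((b & 0x007FFFFFu) | 0x00800000u);
    int32_t e = ea + eb - 127;           /* remove one copy of the bias */

    if (prod & (1ULL << 47)) {           /* product in [2,4): one extra shift */
        prod >>= 24;
        e += 1;
    } else {                             /* product in [1,2) */
        prod >>= 23;
    }
    if (e <= 0)   return sign;                 /* underflow: flush to zero */
    if (e >= 255) return sign | 0x7F800000u;   /* overflow: infinity */
    return sign | ((uint32_t)e << 23) | ((uint32_t)prod & 0x007FFFFFu);
}

/* Add two positive, normal values. The variable-count right shift that
   aligns the smaller operand is the step that benefits from a barrel
   shifter. With same-sign operands renormalizing needs at most one shift;
   subtraction would need a variable left shift as well. */
uint32_t soft_fadd_pos_bits(uint32_t a, uint32_t b)
{
    if (a < b) { uint32_t t = a; a = b; b = t; }   /* a is now the larger */
    int32_t ea = (int32_t)((a >> 23) & 0xFFu);
    int32_t eb = (int32_t)((b >> 23) & 0xFFu);
    uint32_t ma = (a & 0x007FFFFFu) | 0x00800000u;
    uint32_t mb = (b & 0x007FFFFFu) | 0x00800000u;

    uint32_t shift = (uint32_t)(ea - eb);
    mb = (shift < 24u) ? (mb >> shift) : 0u;       /* align exponents */

    uint32_t sum = ma + mb;                        /* at most 25 bits */
    if (sum & 0x01000000u) {                       /* carry out: renormalize */
        sum >>= 1;
        ea += 1;
    }
    if (ea >= 255) return 0x7F800000u;             /* overflow: infinity */
    return ((uint32_t)ea << 23) | (sum & 0x007FFFFFu);
}

/* Convenience wrappers taking and returning float. */
float soft_fmul(float x, float y) { return u2f(soft_fmul_bits(f2u(x), f2u(y))); }
float soft_fadd_pos(float x, float y) { return u2f(soft_fadd_pos_bits(f2u(x), f2u(y))); }
```

With these definitions, `soft_fmul(1.5f, 2.0f)` yields 3.0f and `soft_fadd_pos(1.5f, 2.0f)` yields 3.5f. The multiply has no data-dependent shift count at all, while the add needs one for alignment, which matches the point about add/sub being the more costly operation on a core without a variable shifter.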
Reply by tns1 August 13, 2008
tns1 wrote:
> I have been looking at current mainstream 32bit embedded processors for
> my project (ARM, cortex, PPC, coldfire, etc). I would sure like to find
> a single device that has most of what I need, but I am running out of
> places to look. The main problem is memory support.
>
> The wish list is:
> cortex M3
> 40-60MIPS performance
> simple
> 3 banks of Flash 1MB internal, 512KB, 256KB
> 2 banks of RAM 1MB, 512KB
> 4+uarts
> wdt
> rtc
> lcd support
> A/D
> D/A
>
> I don't expect to find one device that has all that built in, but I'd
> expect when you start with a core that has 4GB address space there would
> be at least a few devices supporting a good chunk of that. Instead there
> are lots of devices with no more than 512K internal code Flash, and 64K
> internal SRAM. If they have more it is broken up into small
> non-contiguous pieces. An example is the STR912. Most of the bells and
> whistles I want, high speed internal flash and sram, but just not enough
> of it.
>
> External memory support is either limited, not mapped contiguous with
> internal memory, or some hokey bank switched scheme. Give me a few more
> address lines or programmable ext. chip selects.
>
> At the other extreme are devices with much higher clock speeds, almost
> no internal memory but huge external memory support. These are typically
> large BGA packages and one look at the data sheet tells you that the
> design is going to take a lot longer and have an extra 2-4 layers.
>
> Isn't there anything between these extremes?
After more study, the requirements are a bit different. The addition of an FPU and MMU eliminated many of my prior choices. Instead of focusing on the processor, it is probably more important to look for an existing dev board with most of these features and good tools support, since a custom board could delay SW development.

40-60MIPS performance, 32bit single core
HW floating point
MMU is optional but preferred
Support for at least two separate banks of sectored NOR style Flash, internal or external. Bank1 is 1MB min execute-in-place, bank2 is 512KB min. A single large bank could work.
Support for at least two separate banks of RAM, internal or external. Bank1 is 1MB min, bank2 is 512KB min, battery backed SRAM.
1 ethernet
qvga lcd support
4+uarts
wdt,rtc,A/D,D/A

I realize that some of the items like the D/A and extra memory banks or uarts will not be found on a dev board. That's fine as long as the processor will support adding these to the project board. I do need at least one big chunk each of NOR Flash and RAM.

The best fit so far is the Phytec LPC3000 boards. The Logic Card Engine boards look promising too. As interesting as their chips are, I don't see any Infineon boards beyond the 'bare-bones' kits, and working their chips into the design would mean a larger BOM.
Reply by Hans Odeberg August 8, 2008
On 8 Aug, 00:46, tns1 <t...@cox.net> wrote:
> I have been looking at current mainstream 32bit embedded processors for
> my project (ARM, cortex, PPC, coldfire, etc). I would sure like to find
> a single device that has most of what I need, but I am running out of
> places to look. The main problem is memory support.
>
> The wish list is:
> cortex M3
> 40-60MIPS performance
> simple
> 3 banks of Flash 1MB internal, 512KB, 256KB
> 2 banks of RAM 1MB, 512KB
> 4+uarts
> wdt
> rtc
> lcd support
> A/D
> D/A
>
> I don't expect to find one device that has all that built in, but I'd
> expect when you start with a core that has 4GB address space there would
> be at least a few devices supporting a good chunk of that. Instead there
> are lots of devices with no more than 512K internal code Flash, and 64K
> internal SRAM. If they have more it is broken up into small
> non-contiguous pieces. An example is the STR912. Most of the bells and
> whistles I want, high speed internal flash and sram, but just not enough
> of it.
>
> External memory support is either limited, not mapped contiguous with
> internal memory, or some hokey bank switched scheme. Give me a few more
> address lines or programmable ext. chip selects.
>
> At the other extreme are devices with much higher clock speeds, almost
> no internal memory but huge external memory support. These are typically
> large BGA packages and one look at the data sheet tells you that the
> design is going to take a lot longer and have an extra 2-4 layers.
>
> Isn't there anything between these extremes?
A one-chip solution, off the shelf, with the amount of flash and ram you are asking for is unlikely to exist, as other posters have pointed out. If a simple board design is more important to you than price: have you considered buying a module, with chip + memory mounted on a small PCB? Googling for "arm module" will give you a few hits.
Reply by David Brown August 8, 2008
tns1 wrote:
> I have been looking at current mainstream 32bit embedded processors for
> my project (ARM, cortex, PPC, coldfire, etc). I would sure like to find
> a single device that has most of what I need, but I am running out of
> places to look. The main problem is memory support.
>
> The wish list is:
> cortex M3
> 40-60MIPS performance
> simple
> 3 banks of Flash 1MB internal, 512KB, 256KB
> 2 banks of RAM 1MB, 512KB
> 4+uarts
> wdt
> rtc
> lcd support
> A/D
> D/A
>
> I don't expect to find one device that has all that built in, but I'd
> expect when you start with a core that has 4GB address space there would
> be at least a few devices supporting a good chunk of that. Instead there
> are lots of devices with no more than 512K internal code Flash, and 64K
> internal SRAM. If they have more it is broken up into small
> non-contiguous pieces. An example is the STR912. Most of the bells and
> whistles I want, high speed internal flash and sram, but just not enough
> of it.
>
> External memory support is either limited, not mapped contiguous with
> internal memory, or some hokey bank switched scheme. Give me a few more
> address lines or programmable ext. chip selects.
>
> At the other extreme are devices with much higher clock speeds, almost
> no internal memory but huge external memory support. These are typically
> large BGA packages and one look at the data sheet tells you that the
> design is going to take a lot longer and have an extra 2-4 layers.
>
> Isn't there anything between these extremes?
There is a conflict between the ideal process parameters for making a RAM chip, a Flash device, and a microcontroller - they are all different (things like number and type of layers, size of features, type of doping, etc.). Thus if you start with a microcontroller-optimised process, each bit of flash is significantly bigger, slower, and more expensive than if you start with a flash-optimised process. So most microcontrollers have a relatively small flash (exceptions include some Freescale MPC devices with up to 1 MB flash - costing something like $30 more than the 0 MB flash version - and a few devices made with a flash-optimised process, which therefore have a bigger, slower, and more power-hungry microcontroller part).

This leads to two separate types of chips - microcontrollers, with up to something like 512K flash and 64K ram, and embedded microprocessors with no flash, an external databus, and typically a block or two of internal RAM (which is often good for the stack or other fast-access memory). There is not much in between.

If you have a full 32-bit databus, with something like 24 address pins and a bunch of control pins and chip selects, you quickly have 80 pins for the databus alone. If the embedded microprocessor has a range of peripherals, especially things like lcd support or Ethernet, you get beyond the range of cheap non-bga packages very fast.
Reply by Jim Granville August 7, 2008
tns1 wrote:
> I have been looking at current mainstream 32bit embedded processors for
> my project (ARM, cortex, PPC, coldfire, etc). I would sure like to find
> a single device that has most of what I need, but I am running out of
> places to look. The main problem is memory support.
>
> The wish list is:
> cortex M3
> 40-60MIPS performance
> simple
> 3 banks of Flash 1MB internal, 512KB, 256KB
> 2 banks of RAM 1MB, 512KB
> 4+uarts
> wdt
> rtc
> lcd support
> A/D
> D/A
>
> I don't expect to find one device that has all that built in, but I'd
> expect when you start with a core that has 4GB address space there would
> be at least a few devices supporting a good chunk of that.
Why? That is rather strange reverse logic. Chips are built with the memory that customers NEED, not with the memory that the bus might be able to address! Price matters.

> Instead there
> are lots of devices with no more than 512K internal code Flash, and 64K
> internal SRAM.
Because that is all they need, for most embedded applications.

<snip>
> At the other extreme are devices with much higher clock speeds, almost
> no internal memory but huge external memory support. These are typically
> large BGA packages and one look at the data sheet tells you that the
> design is going to take a lot longer and have an extra 2-4 layers.
>
> Isn't there anything between these extremes?
Not really. One is a Microcontroller, and one is a Microprocessor.

The Cortex M3 targets bottom-end Microcontrollers in the 32bit space. Your specs, especially the large RAM, push you into Microprocessor space.

The top end Automotive space tends to have the most flash, up to 4MBytes.

Perhaps look at Infineon's TriCore: the TC116x has 1.5MByte in QFP, and you add the RAM?

I think Freescale have some large-flash (1.5MBytes?) PowerPC variants, also in QFP.

-jg