Looking for ARM system with RTOS| page 3

Reply by ●January 3, 2013

On Thursday, January 3, 2013 12:32:46 PM UTC-6, Mark Borgerson wrote:

> The 50Mhz ADSP-21061KSZ-200-ND  is $101.43 qty 1 at Digikey.  The
> 168Mhz STM32F407 is about $12.  That seems pretty competitive to
> me ;-)  How much work would it take to tune the STM32 and would you
> sell enough with an $80 lower price to be worth the effort.

The OP was talking about the ADSP-21xx fixed-point DSP series.  The ADSP-21xxx part you listed is floating point.  Both the 21xx and the 21xxx are also a bit long in the tooth.

For a better comparison, look at something like the newer Blackfin series: the ADSP-BF592 @ 200MHz is ~$6 qty. 1 at Digikey.  TI also has some fixed-point DSP 'controllers' that are similar.

Reply by Mark Borgerson ●January 3, 2013

In article <c9ffae6a-371d-4350-a548-4028c1eab2ed@googlegroups.com>, 
amdyer@gmail.com says...
> 
> On Thursday, January 3, 2013 12:32:46 PM UTC-6, Mark Borgerson wrote:
> 
> > The 50Mhz ADSP-21061KSZ-200-ND  is $101.43 qty 1 at Digikey.  The
> > 168Mhz STM32F407 is about $12.  That seems pretty competitive to
> > me ;-)  How much work would it take to tune the STM32 and would you
> > sell enough with an $80 lower price to be worth the effort.
> 
> The OP was talking about the ADSP-21xx fixed point DSP series.  The ADSP-21XXX  part you listed is floating point.  21XX and 21XXX are also both a bit long in the tooth.
> 
> For a better comparison look at something like the newer blackfin series, ADSP-BF592 @ 200MHz is ~$6 qty. 1 at digikey.  TI also has some fixed point DSP 'controllers' that are similar.

OK.  I'm not familiar enough with  Analog Devices DSPs to know the 
difference.  I just searched for ADSP-21 on digikey and didn't get any 
of the fixed point units when selecting for 50MHz units.  A return visit 
showed some ADSP-21xx units, but still at $18 to $20.

I'm also not familiar enough with DSP to know whether the 32-bit FPU
on the STM32F4xx series would give you any advantage over the
fixed-point DSP chips.  IIRC, the Cortex-M4 does have some SIMD 
instructions useful for DSP work, but I don't know if they use
the FPU or not.
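
On that last point: the Cortex-M4 SIMD/DSP instructions operate on the ordinary 32-bit integer registers, not on the FPU. As a rough illustration, the Python model below shows what a dual 16-bit multiply-add in the style of the SMUAD instruction computes on two packed registers. The helper names are invented for this sketch, and overflow and flag behaviour are not modelled.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's-complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def pack16(high, low):
    """Pack two signed 16-bit values into one 32-bit register image."""
    return ((high & 0xFFFF) << 16) | (low & 0xFFFF)


def smuad(rn, rm):
    """Multiply the two 16-bit halves pairwise and add the products.

    This models the arithmetic only; a real core also wraps the
    result to 32 bits and flags overflow, which is ignored here.
    """
    low_product = to_signed16(rn) * to_signed16(rm)
    high_product = to_signed16(rn >> 16) * to_signed16(rm >> 16)
    return low_product + high_product


a = pack16(3, -2)
b = pack16(5, 7)
result = smuad(a, b)  # 3*5 + (-2)*7
print(result)
```

Two 16-bit multiplies and an add in one integer instruction is the kind of thing that helps fixed-point filter and FFT inner loops, with no floating point involved.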

Mark Borgerson

Reply by Jon Kirwan ●January 3, 2013

On Thu, 3 Jan 2013 10:32:46 -0800, Mark Borgerson
<mborgerson@comcast.net> wrote:

>In article <u3eae8d0bt9c51qq0tbp30mucskp1o4csd@4ax.com>, 
>jonk@infinitefactors.org says...
>> 
>> On Wed, 2 Jan 2013 22:43:15 -0800, Mark Borgerson
>> <mborgerson@comcast.net> wrote:
>> 
>> >In article <9289e8p4ecr3qalegrs5avpq9nmk1ap8jb@4ax.com>, 
>> >jonk@infinitefactors.org says...
>> >> 
>> >> On Wed, 2 Jan 2013 12:52:47 -0800, Mark Borgerson
>> >> <mborgerson@comcast.net> wrote:
>> >> 
>> >> >In article <l996e8dcrons0s9d6104r0t500fra0c0t2@4ax.com>, 
>> >> >jonk@infinitefactors.org says...
>> >> >> 
>> >> >> On Tue, 01 Jan 2013 12:09:40 +0200, upsidedown@downunder.com
>> >> >> wrote:
>> >> >> 
>> >> >> >On Tue, 1 Jan 2013 09:54:30 +0800, "Bruce Varley" <bv@NoSpam.com>
>> >> >> >wrote:
>> >> >> >
>> >> >> >>I need:
>> >> >> >>
>> >> >> >>o  CPU clock 200MHz or higher.
>> >> >> >>
>> >> >> >>o  2 serial ports, with access to the logic level lines on at least one (LV 
>> >> >> >>OK).
>> >> >> >>
>> >> >> >>o  USB support. Socket support also would  be nice, not essential.
>> >> >> >>
>> >> >> >>o  Some sort of file system.
>> >> >> >>
>> >> >> >>o  Guaranteed turnround of 10mS, even lower would be nice. My ARM Linux 
>> >> >> >>won't do better than 20.
>> ><<SNIP>>
>> >> >> 
>> >> >> 10ms turnaround would be... unacceptable.
>> >> >> 
>> >> >I'm a bit puzzled here.  I usually read '10ms'   as 10 milliseconds.
>> >> 
>> >> As do I.
>> >> 
>> >> >That seems like a lot of time for most embedded systems RTOS
>> >> >variants, which have task switch times in the low microseconds
>> >> >on chips like 160MHz  ARM-Cortex STM32s.
>> >> 
>> >> I was using a 20ns cycle time ADSP-21xx processor (50MHz.)
>> >> It's a DSP with fixed cycle counts (1) for each instruction
>> >> and a guaranteed interrupt latency that NEVER varies (with
>> >> certain, inconsequential [to my application] conditions being
>> >> met.)
>> >> 
>> >> >10milliseconds would certainly be too long a response time on
>> >> >many of the instruments I've developed--none of which use
>> >> >an RTOS.   I'm just now starting to play around with 
>> >> >ChiBios and UCoS-II  on the STM32 chips.
>> >> 
>> >> In measurement instruments, which may be used in closed loop
>> >> control systems, predictability (both in terms of phase delay
>> >> relative to the sensor observation and also in terms of the
>> >> variability allowed in that phase delay) is vital.
>> >> 
>> >> I shoot for (and achieve where it is important) variability
>> >> that is measured as 0, or if forced in very small integers >0
>> >> like 1 or maybe 2, of cycle variation... measurement to
>> >> measurement... both in sampling the sensor as well as in
>> >> outputting it via a DAC. (I can't help what happens after.)
>> >> In the best of all cases, I implement the closed loop control
>> >> in the instrument, as well, so that there is no variability
>> >> caused by an external ADC and remaining system. In that case,
>> >> I drive the 0-100% control with similar attention to
>> >> precision control of the external device (heater, boule
>> >> puller, etc.) I also go to the trouble to ensure, where
>> >> branching code exists, that each branch takes exactly the
>> >> same number of cycles.
>> >> 
>> >> I very much dislike, in cases like this, devices with varying
>> >> interrupt latencies (which is almost guaranteed to happen if
>> >> the processor has instructions with varying execution time.)
>> >> I can control my code and the number of cycles each edge of
>> >> it may take, but the hardware latency is out of my control.
>> >> So I look for processors where it is predictable, if I need
>> >> that.
>> >> 
>> >> An STM32 would not qualify in the case I am thinking about.
>> >> 
>> >IIRC, the Cortex M4 instructions which would cause the greatest
>> >variation in interrupt latency (load and store multiple and divide)
>> >are, themselves, interruptible.  I would guess that the interrupt
>> >latency variation would be on the order of 1 to 2 cycle times---
>> >or about 12.5 nSec for a 168MHz clock.  The overall latency is
>> >listed as 12 clock cycles or about 60-70nSec.
>> 
>> I gained some slightly useful benefits by having exactly 0
>> cycle variation in the application I'm talking about. One
>> cycle (20ns in that application) of variation would have made
>> a difference to me. The fact that I didn't have to add
>> hardware to gain that tiny advantage ALSO was a useful
>> benefit.
>> 
>> In the M4, there is also a pipeline and, if I remember,
>> "faults" can occur not only in one stage. (I might be wrong
>> about that.) You have to consider everything -- instruction
>> faults (memory, etc.) But I admit I'm pretty ignorant of the
>> M4, too.
>> 
>> >I can see that multiple-cycle instructions  with variable execution time 
>> >inside the interrupt handler could cause phase variations in the output.  
>> >It might require more work to eliminate them than would be the case
>> >with a DSP having only a few rare cases to consider.
>> >
>> >If you're using a DAC in the loop and want consistent phase delays,
>> >does that require a flash DAC?  With a successive approximation DAC, the 
>> >delay until you get the desired output would seem to depend on the
>> >value output unless there is a fast sample-and-hold between the
>> >DAC and the control system.
>> 
>> I added the full closed loop control PID into the instrument.
>> (It didn't have the ability beforehand.) In doing so, there
>> was no DAC involved at that stage, anymore.
>> 
>> >If you want outputs free of all phase jitter, a sample and hold
>> >triggered by a hardware clock could solve the problem.  The problem
>> >then becomes what synchronization delays are acceptable.
>> 
>> Price, size, power, etc., all mattered. Very competitive
>> marketplace in that case.
>> 
>> Jon
>
>The 50Mhz ADSP-21061KSZ-200-ND  is $101.43 qty 1 at Digikey.  The
>168Mhz STM32F407 is about $12.  That seems pretty competitive to
>me ;-)  How much work would it take to tune the STM32 and would you
>sell enough with an $80 lower price to be worth the effort.

The ADSP-21xxx is not even close to the ADSP-21xx and I
wasn't using the ADSP-21061KSZ. It was an ADSP-2111 and
ADSP-2105. They were MUCH cheaper at the time (circa early
1990s) and the competition elsewhere was effectively zero.
Since then there are many more options and many more players
and the ADSP-21xx processors I was using probably aren't even
available (much, if at all.) If I were doing this today, I'd
pick something else.

>I suspect the STM32 is lower in power at 168Mhz than the DSP
>at 50MHz, but I haven't verified that guess.

There was NO floating point on the units I used. A nice
barrel shifter (combinatorial, one-cycle) though and I used
it for writing my own floating point. Power consumption was
quite low --- for the time.
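
As a rough illustration of why a one-cycle barrel shifter matters for home-made floating point, here is a minimal sketch in Python (not the original ADSP-21xx code, which is not shown in this thread). It keeps a number as a mantissa/exponent pair and renormalizes after each multiply using nothing but shifts. The 16-bit mantissa width and all names are assumptions for illustration, and only positive values are handled.

```python
MANT_BITS = 16  # assumed mantissa width for this sketch


def normalize(mant, exp):
    """Shift a positive mantissa so its top set bit is bit MANT_BITS-1."""
    if mant == 0:
        return 0, 0
    shift = mant.bit_length() - MANT_BITS
    if shift > 0:
        mant >>= shift    # too wide: shift right (truncates low bits)
    else:
        mant <<= -shift   # too narrow: shift left
    return mant, exp + shift


def soft_mul(a, b):
    """Multiply two (mantissa, exponent) pairs; value = mant * 2**exp."""
    return normalize(a[0] * b[0], a[1] + b[1])


def to_float(x):
    """Convert a (mantissa, exponent) pair to a Python float for checking."""
    return x[0] * 2.0 ** x[1]


three = normalize(3, 0)
five = normalize(5, 0)
product = soft_mul(three, five)
print(product, to_float(product))
```

Every normalize call is one variable-length shift, so a shifter that handles any shift count in a single cycle keeps the cost of each software floating-point operation small and constant.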

Jon

Reply by Jon Kirwan ●January 3, 2013

On Thu, 3 Jan 2013 13:13:08 -0800, Mark Borgerson
<mborgerson@comcast.net> wrote:

>In article <c9ffae6a-371d-4350-a548-4028c1eab2ed@googlegroups.com>, 
>amdyer@gmail.com says...
>> 
>> On Thursday, January 3, 2013 12:32:46 PM UTC-6, Mark Borgerson wrote:
>> 
>> > The 50Mhz ADSP-21061KSZ-200-ND  is $101.43 qty 1 at Digikey.  The
>> > 168Mhz STM32F407 is about $12.  That seems pretty competitive to
>> > me ;-)  How much work would it take to tune the STM32 and would you
>> > sell enough with an $80 lower price to be worth the effort.
>> 
>> The OP was talking about the ADSP-21xx fixed point DSP series.  The ADSP-21XXX  part you listed is floating point.  21XX and 21XXX are also both a bit long in the tooth.
>> 
>> For a better comparison look at something like the newer blackfin series, ADSP-BF592 @ 200MHz is ~$6 qty. 1 at digikey.  TI also has some fixed point DSP 'controllers' that are similar.
>
>OK.  I'm not familiar enough with  Analog Devices DSPs to know the 
>difference.  I just searched for ADSP-21 on digikey and didn't get any 
>of the fixed point units when selecting for 50MHz units.  A return visit 
>showed some ADSP-21xx units, but still at $18 to $20.
>
>I'm also not familiar enough with DSP to know whether the 32-bit FPU
>on the STM32F4xx series would give you any advantage over the
>fixed-point DSP chips.  IIRC, the Cortex-M4 does have some SIMD 
>instructions useful for DSP work, but I don't know if they use
>the FPU or not.
>
>Mark Borgerson

ALL FPUs are HUGE and consume LOTS OF POWER. You pay in die
space, which reduces yield, increases cost, and burns power
whether or not you need the FP at the moment. You pay for the
beast every single cycle, need it or not.

The wonderful and brilliant idea behind the ADSP-21xx (an
integer cpu from top to bottom) was its support for FP in
the form of specialized integer ALU functionality. You pay a
LOT less if you are writing what amounts to your own FP
microcode and have specialized units for the purpose. The
main unit they provided was the combinatorial barrel shifter
(and a MAC.) You pay a LOT LESS for those two on every cycle,
and they take up so much less die space, too.

Besides all that, they have other useful abilities for some
applications that no FP unit designer would consider making
available in a hardware FP -- they are focused on providing
an easy to use FP unit. But actually, the raw guts underneath
the hood of an FP unit have purposes OTHER than FP, too. But
they don't give you direct access to any of that because that
isn't their market. So you pay for a burdensome, massive die
with LOTS of under-the-hood functional units to get the job
done, and you DO NOT get access to it in raw form so that you
can take other advantage of it.

The ADSP-21xx simply exposed a couple of bare-bones bits,
kept it small, and let you write the "firmware" you want.

When doing an FFT for example, there are some optimizations
to the process that you CANNOT DO with a floating point unit
but CAN DO when you have the raw pieces needed to make one,
which allow you to perform very fast FFTs.
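
The post does not say which optimizations were used. One technique of this general kind that is common in fixed-point FFTs is block floating point, where a whole block of samples shares a single exponent and is rescaled with one shift between passes. The Python sketch below is only an illustration of that idea under assumed 16-bit words; the function names are hypothetical and samples are assumed to lie within -32767..32767.

```python
def block_exponent(samples, word_bits=16):
    """Left shifts available before the largest sample would overflow."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return 0
    # Keep one bit for the sign; never return a negative shift.
    return max((word_bits - 1) - peak.bit_length(), 0)


def block_normalize(samples, word_bits=16):
    """Scale a whole block by one shared shift; returns (scaled, shift)."""
    shift = block_exponent(samples, word_bits)
    return [s << shift for s in samples], shift


scaled, shift = block_normalize([100, -300, 25])
print(scaled, shift)
```

Because the exponent is tracked once per block instead of once per value, the inner loops stay pure integer multiply-accumulate, which is where a shifter plus a MAC can do well against a general FP unit.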

It was a good idea.

The reason it just isn't done much is that the only clients
for such a beast are programmers who have thorough numerical
methods experience and are very good at math and writing FP
microcode. Which is a TINY market, as they found out.

But the concept, for those of us who CAN do those things, is
fantastic.

Jon

Reply by Mark Borgerson ●January 3, 2013

In article <7p4ce8tmi27tlhmjlqn0em519nlftk2c4n@4ax.com>, 
jonk@infinitefactors.org says...
> 
> On Thu, 3 Jan 2013 13:13:08 -0800, Mark Borgerson
> <mborgerson@comcast.net> wrote:
> 
> >In article <c9ffae6a-371d-4350-a548-4028c1eab2ed@googlegroups.com>, 
> >amdyer@gmail.com says...
> >> 
> >> On Thursday, January 3, 2013 12:32:46 PM UTC-6, Mark Borgerson wrote:
> >> 
> >> > The 50Mhz ADSP-21061KSZ-200-ND  is $101.43 qty 1 at Digikey.  The
> >> > 168Mhz STM32F407 is about $12.  That seems pretty competitive to
> >> > me ;-)  How much work would it take to tune the STM32 and would you
> >> > sell enough with an $80 lower price to be worth the effort.
> >> 
> >> The OP was talking about the ADSP-21xx fixed point DSP series.  The ADSP-21XXX  part you listed is floating point.  21XX and 21XXX are also both a bit long in the tooth.
> >> 
> >> For a better comparison look at something like the newer blackfin series, ADSP-BF592 @ 200MHz is ~$6 qty. 1 at digikey.  TI also has some fixed point DSP 'controllers' that are similar.
> >
> >OK.  I'm not familiar enough with  Analog Devices DSPs to know the 
> >difference.  I just searched for ADSP-21 on digikey and didn't get any 
> >of the fixed point units when selecting for 50MHz units.  A return visit 
> >showed some ADSP-21xx units, but still at $18 to $20.
> >
> >I'm also not familiar enough with DSP to know whether the 32-bit FPU
> >on the STM32F4xx series would give you any advantage over the
> >fixed-point DSP chips.  IIRC, the Cortex-M4 does have some SIMD 
> >instructions useful for DSP work, but I don't know if they use
> >the FPU or not.
> >
> >Mark Borgerson
> 
> ALL FPUs are HUGE and consume LOTS OF POWER. You pay in die
> space, which reduces yield, increases cost, and burns power
> whether or not you need the FP at the moment. You pay for the
> beast every single cycle, need it or not.

HUGE and LOTS OF POWER are relative---especially on chips
with 1MB of flash and 192KB of RAM, high and low speed
USB, ethernet and camera interfaces.  As for increasing 
cost, the STM32F405 with FPU is within a dollar of the
price of the STM32F205 without FPU.

I guess power is also relative.  The STM32F405 at full
speed uses about 100mA at 3.3V.  Shut off most of
the peripherals, and the power goes down by half.

The STM32F205, without FPU and at 120MHz instead of
168MHz, uses about 80mA under the same conditions.
So it looks like the increase in power to get the
FPU is about 25%--but you get 40% higher clock
speed as well.
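
The comparison can be checked with a few lines of Python. The figures are the approximate ones quoted in this post (100mA at 168MHz with FPU, 80mA at 120MHz without), not datasheet values, and the variable and function names are made up for the sketch.

```python
# Approximate figures quoted in the post (full speed at 3.3 V).
f405 = {"clock_mhz": 168, "current_ma": 100}  # with FPU
f205 = {"clock_mhz": 120, "current_ma": 80}   # without FPU


def percent_increase(new, old):
    """Relative increase of new over old, in percent."""
    return 100.0 * (new - old) / old


def ma_per_mhz(part):
    """Supply current normalized by clock rate."""
    return part["current_ma"] / part["clock_mhz"]


current_increase = percent_increase(f405["current_ma"], f205["current_ma"])
clock_increase = percent_increase(f405["clock_mhz"], f205["clock_mhz"])

print(f"current: +{current_increase:.0f}%, clock: +{clock_increase:.0f}%")
print(f"F405: {ma_per_mhz(f405):.3f} mA/MHz")
print(f"F205: {ma_per_mhz(f205):.3f} mA/MHz")
```

On these numbers the part with the FPU draws less current per MHz, so the extra 20mA looks more like the cost of the higher clock than of the FPU itself.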

I was running some RTOS test code yesterday and
comparing the times to generate tables of sine 
values with and without the FPU.  With the FPU it
was 8 to 10X faster.  I can't really say what the
power consumption was, since the Discovery board
was running off the USB power.  I've got an 
Olimex board where I can measure the current,
so I'll give that a try.

The Cortex-M4 does have control bits to enable and
disable the FPU, but I don't know their effect on
power consumption, or whether you would want to
do that on a function-by-function basis.

The easy way to save power on the Cortex-M4 is to just
shut off the CPU clock until the next interrupt.
> 
> The wonderful and brilliant idea behind the ADSP-21xx (an
> integer cpu from top to bottom) was its support for FP in
> the form of specialized integer ALU functionality. You pay a
> LOT less if you are writing what amounts to your own FP
> microcode and have specialized units for the purpose. The
> main unit they provided was the combinatorial barrel shifter
> (and a MAC.) You pay a LOT LESS for those two on every cycle,
> and they take up so much less die space, too.

I agree that a specialized DSP may have advantages if all
you want is number crunching and  limited IO at minimum
cost and minimum power.   However, the OP in this thread
was looking for a more complex system.
> 
> Besides all that, they have other useful abilities for some
> applications that no FP unit designer would consider making
> available in a hardware FP -- they are focused on providing
> an easy to use FP unit. But actually, the raw guts underneath
> the hood of an FP unit have purposes OTHER than FP, too. But
> they don't give you direct access to any of that because that
> isn't their market. So you pay for a burdensome, massive die
> with LOTS of under-the-hood functional units to get the job
> done, and you DO NOT get access to it in raw form so that you
> can take other advantage of it.
> 
> The ADSP-21xx simply exposed a couple of bare-bones bits,
> kept it small, and let you write the "firmware" you want.

Which are the bare-bone bits that they exposed?
> 
> When doing an FFT for example, there are some optimizations
> to the process that you CANNOT DO with a floating point unit
> but CAN DO when you have the raw pieces needed to make one,
> which allow you to perform very fast FFTs.
> 
> It was a good idea.
> 
> The reason it just isn't done much is that the only clients
> for such a beast are programmers who have thorough numerical
> methods experiences and are very good at math and writing FP
> microcode. Which is a TINY market, as they found out.
> 
> But the concept, for those of us who CAN do those things, is
> fantastic.
> 
I guess it's possible to make a living that way.  What was the
old maxim---"if it were easy, everyone would do it"?

Mark Borgerson

Reply by Jon Kirwan ●January 3, 2013

On Thu, 3 Jan 2013 17:41:05 -0800, Mark Borgerson
<mborgerson@comcast.net> wrote:

>In article <7p4ce8tmi27tlhmjlqn0em519nlftk2c4n@4ax.com>, 
>jonk@infinitefactors.org says...
>> 
>> On Thu, 3 Jan 2013 13:13:08 -0800, Mark Borgerson
>> <mborgerson@comcast.net> wrote:
>> 
>> >In article <c9ffae6a-371d-4350-a548-4028c1eab2ed@googlegroups.com>, 
>> >amdyer@gmail.com says...
>> >> 
>> >> On Thursday, January 3, 2013 12:32:46 PM UTC-6, Mark Borgerson wrote:
>> >> 
>> >> > The 50Mhz ADSP-21061KSZ-200-ND  is $101.43 qty 1 at Digikey.  The
>> >> > 168Mhz STM32F407 is about $12.  That seems pretty competitive to
>> >> > me ;-)  How much work would it take to tune the STM32 and would you
>> >> > sell enough with an $80 lower price to be worth the effort.
>> >> 
>> >> The OP was talking about the ADSP-21xx fixed point DSP series.  The ADSP-21XXX  part you listed is floating point.  21XX and 21XXX are also both a bit long in the tooth.
>> >> 
>> >> For a better comparison look at something like the newer blackfin series, ADSP-BF592 @ 200MHz is ~$6 qty. 1 at digikey.  TI also has some fixed point DSP 'controllers' that are similar.
>> >
>> >OK.  I'm not familiar enough with  Analog Devices DSPs to know the 
>> >difference.  I just searched for ADSP-21 on digikey and didn't get any 
>> >of the fixed point units when selecting for 50MHz units.  A return visit 
>> >showed some ADSP-21xx units, but still at $18 to $20.
>> >
>> >I'm also not familiar enough with DSP to know whether the 32-bit FPU
>> >on the STM32F4xx series would give you any advantage over the
>> >fixed-point DSP chips.  IIRC, the Cortex-M4 does have some SIMD 
>> >instructions useful for DSP work, but I don't know if they use
>> >the FPU or not.
>> >
>> >Mark Borgerson
>> 
>> ALL FPUs are HUGE and consume LOTS OF POWER. You pay in die
>> space, which reduces yield, increases cost, and burns power
>> whether or not you need the FP at the moment. You pay for the
>> beast every single cycle, need it or not.
>
>HUGE and LOTS OF POWER are relative---especially on chips
>with 1MB of flash and 192KB of RAM, high and low speed
>USB, ethernet and camera interfaces.  As for increasing 
>cost, the STM32F405 with FPU is within a dollar of the
>price of the STM32F205 without FPU.
>
>I guess power is also relative.  The STM32F405 at full
>speed uses about 100mA at 3.3V.  Shut off most of
>the peripherals, and the power goes down by half.
>
>The STM32F205, without FPU and at 120MHz instead of
>168MHz, uses about 80mA under the same conditions.
>So it looks like the increase in power to get the
>FPU is about 25%--but you get 40% higher clock
>speed as well.
>
>I was running some RTOS test code yesterday and
>comparing the times to generate tables of sine 
>values with and without the FPU.  With the FPU
>was 8 to 10X faster.  I can't really say what the
>power consumption was, since the Discovery board
>was running off the USB power.  I've got an 
>Olimex board where I can measure the current,
>so I'll give that a try.
>
>The Cortex CM4 does have control bits to enable and
>disable the FPU, but I don't know their effect on
>power consumption, or whether you would want to
>do that on a function-by-function basis.
>
>The easy way to save power on the CM4 is to just
>shut off the CPU clock until the next interrupt.

Please keep the context in mind. I've had to remind you
before. This is 1990.

>> The wonderful and brilliant idea behind the ADSP-21xx (an
>> integer cpu from top to bottom) was its support for FP in
>> the form of specialized integer ALU functionality. You pay a
>> LOT less if you are writing what amounts to your own FP
>> microcode and have specialized units for the purpose. The
>> main unit they provided was the combinatorial barrel shifter
>> (and a MAC.) You pay a LOT LESS for those two on every cycle,
>> and they take up so much less die space, too.
>
>I agree that a specialized DSP may have advantages if all
>you want is number crunching and  limited IO at minimum
>cost and minimum power.   However, the OP in this thread
>was looking for a more complex system

Yes, we've been digressing. Or, at least, I have been. You
can speak for yourself, of course.

>> Besides all that, they have other useful abilities for some
>> applications that no FP unit designer would consider making
>> available in a hardware FP -- they are focused on providing
>> an easy to use FP unit. But actually, the raw guts underneath
>> the hood of an FP unit have purposes OTHER than FP, too. But
>> they don't give you direct access to any of that because that
>> isn't their market. So you pay for a burdensome, massive die
>> with LOTS of under-the-hood functional units to get the job
>> done, and you DO NOT get access to it in raw form so that you
>> can take other advantage of it.
>> 
>> The ADSP-21xx simply exposed a couple of bare-bones bits,
>> kept it small, and let you write the "firmware" you want.
>
>Which are the bare-bone bits that they exposed?

I mentioned them. The combinatorial barrel-shifter, the MAC
(which as it was designed was useful for FP work, as well as
the regular integer work), and the two specialized DIV
instructions they included. (Too long an explanation here,
but suffice it that they didn't actually divide -- they
implemented a subset step only.)
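
To show what a divide built from single steps looks like, here is a minimal Python sketch of textbook restoring division, where one call produces one quotient bit and the caller repeats it once per bit. This is not the ADSP-21xx's actual primitive, whose details are not given in this thread; the function names and the 16-bit default are assumptions for illustration, and the inputs are assumed unsigned with a non-zero divisor.

```python
def divide_step(remainder, quotient, divisor, dividend_bit):
    """One step of restoring division: yields exactly one quotient bit."""
    remainder = (remainder << 1) | dividend_bit
    if remainder >= divisor:
        return remainder - divisor, (quotient << 1) | 1
    return remainder, quotient << 1


def divide(dividend, divisor, bits=16):
    """Unsigned divide made by repeating the one-bit step `bits` times.

    Assumes 0 <= dividend < 2**bits and divisor > 0.
    """
    remainder = 0
    quotient = 0
    for i in range(bits - 1, -1, -1):
        bit = (dividend >> i) & 1
        remainder, quotient = divide_step(remainder, quotient, divisor, bit)
    return quotient, remainder


print(divide(1000, 7))
```

Exposing only the step keeps the hardware small, and the programmer can stop early when fewer quotient bits are needed, which a monolithic divide instruction does not allow.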

I liked the crafted balance they took.

>> When doing an FFT for example, there are some optimizations
>> to the process that you CANNOT DO with a floating point unit
>> but CAN DO when you have the raw pieces needed to make one,
>> which allow you to perform very fast FFTs.
>> 
>> It was a good idea.
>> 
>> The reason it just isn't done much is that the only clients
>> for such a beast are programmers who have thorough numerical
>> methods experiences and are very good at math and writing FP
>> microcode. Which is a TINY market, as they found out.
>> 
>> But the concept, for those of us who CAN do those things, is
>> fantastic.
>> 
>I guess it's possible to make a living that way.  What was the
>old maxim ---"if it were easy, everyone would do it"

Well, the manufacturers are looking for larger audiences and
do NOT cater to niche markets until and unless every other
better profit center has been exhausted.

At the time, I benefited from a narrow moment when doing a
full FP implementation wasn't in the cards, yet the need for
fast implementation was. I could implement a floating point
complex-in, complex-out FFT that performed its work in less
time than their later FP versions of the CPU could do,
because I could take advantage of things. Eventually, the
Blackfin and later incarnations advanced in clock rates AND
performance and exceeded the older parts they no longer sold.
But that's normal progression.

...

I'll give another example of my mindset. The current spate of
multi-GHz x86 processors from Intel are fabricated with
feature sizes and GTL technology (unless they've got
something still newer since I last looked) that would permit
the production of a VERY LOW power 100MHz laptop that could
easily run for quite some time using nothing more than a few
AA batteries. Nothing special. Just cheap Costco alkalines.
(In fact, it was done once with the HP Omnibook 300/Win
3.1... but with older feature sizes.) The current technology
would wipe the floor with that older HP Omnibook, which
itself put Windows completely in ROM (no boot from secondary
storage) and would run for weeks on AA batteries available
anywhere in the world. I need nothing more than an 80386 using
those feature sizes -- no FP -- and running at 66MHz to
100MHz for word processing. The nice thing about that
specific Omnibook (and none of the others) is that there was
no special battery technology, it weighed almost nothing,
included a wonderful pop-out mouse built in, and required
nothing special when you closed it. It just shut off all
power except and only what was required to retain the static
ram. So when I opened it, I was exactly where I left off --
cursor, etc -- with exactly 0 seconds wait. When someone
asked me a question, I closed it, answered the question,
opened the laptop, and just kept on going. Weight was VERY
low -- lower than any laptop I'm aware of today.

But their is no longer a marketplace for this. So I can only
get laptops with MUCH MUCH shorter active runtimes, despite
huge advances in battery technology (for much more cost) and
despite huge advances in FAB technology (which could be used
to greatly reduce power consumption from that time.)

Jon

Reply by Jon Kirwan ●January 3, 2013

On Thu, 03 Jan 2013 18:23:13 -0800, Jon Kirwan
<jonk@infinitefactors.org> wrote:

>their

there

Reply by Mark Borgerson ●January 3, 2013

In article <gq3ce898jeru18r5ufgarts0tb7kfl88ri@4ax.com>, 
jonk@infinitefactors.org says...
> 
> On Thu, 3 Jan 2013 10:32:46 -0800, Mark Borgerson
> <mborgerson@comcast.net> wrote:
> 
> >In article <u3eae8d0bt9c51qq0tbp30mucskp1o4csd@4ax.com>, 
> >jonk@infinitefactors.org says...
> >> 
> >> On Wed, 2 Jan 2013 22:43:15 -0800, Mark Borgerson
> >> <mborgerson@comcast.net> wrote:
> >> 
> >> >In article <9289e8p4ecr3qalegrs5avpq9nmk1ap8jb@4ax.com>, 
> >> >jonk@infinitefactors.org says...
> >> >> 
> >> >> On Wed, 2 Jan 2013 12:52:47 -0800, Mark Borgerson
> >> >> <mborgerson@comcast.net> wrote:
> >> >> 
> >> >> >In article <l996e8dcrons0s9d6104r0t500fra0c0t2@4ax.com>, 
> >> >> >jonk@infinitefactors.org says...
> >> >> >> 
> >> >> >> On Tue, 01 Jan 2013 12:09:40 +0200, upsidedown@downunder.com
> >> >> >> wrote:
> >> >> >> 
> >> >> >> >On Tue, 1 Jan 2013 09:54:30 +0800, "Bruce Varley" <bv@NoSpam.com>
> >> >> >> >wrote:
> >> >> >> >
> >> >> >> >>I need:
> >> >> >> >>
> >> >> >> >>o  CPU clock 200MHz or higher.
> >> >> >> >>
> >> >> >> >>o  2 serial ports, with access to the logic level lines on at least one (LV 
> >> >> >> >>OK).
> >> >> >> >>
> >> >> >> >>o  USB support. Socket support also would  be nice, not essential.
> >> >> >> >>
> >> >> >> >>o  Some sort of file system.
> >> >> >> >>
> >> >> >> >>o  Guaranteed turnround of 10mS, even lower would be nice. My ARM Linux 
> >> >> >> >>won't do better than 20.
> >> ><<SNIP>>
> >> >> >> 
> >> >> >> 10ms turnaround would be... unacceptable.
> >> >> >> 
> >> >> >I'm a bit puzzled here.  I usually read '10ms'   as 10 milliseconds.
> >> >> 
> >> >> As do I.
> >> >> 
> >> >> >That seems like a lot of time for most embedded systems RTOS
> >> >> >variants, which have task switch times in the low microseconds
> >> >> >on chips like 160MHz  ARM-Cortex STM32s.
> >> >> 
> >> >> I was using a 20ns cycle time ADSP-21xx processor (50MHz.)
> >> >> It's a DSP with fixed cycle counts (1) for each instruction
> >> >> and a guaranteed interrupt latency that NEVER varies (with
> >> >> certain, inconsequential [to my application] conditions being
> >> >> met.)
> >> >> 
> >> >> >10milliseconds would certainly be too long a response time on
> >> >> >many of the instruments I've developed--none of which use
> >> >> >an RTOS.   I'm just now starting to play around with 
> >> >> >ChiBios and UCoS-II  on the STM32 chips.
> >> >> 
> >> >> In measurement instruments, which may be used in closed loop
> >> >> control systems, predictability (both in terms of phase delay
> >> >> relative to the sensor observation and also in terms of the
> >> >> variability allowed in that phase delay) is vital.
> >> >> 
> >> >> I shoot for (and achieve where it is important) variability
> >> >> that is measured as 0, or if forced in very small integers >0
> >> >> like 1 or maybe 2, of cycle variation... measurement to
> >> >> measurement... both in sampling the sensor as well as in
> >> >> outputting it via a DAC. (I can't help what happens after.)
> >> >> In the best of all cases, I implement the closed loop control
> >> >> in the instrument, as well, so that there is no variability
> >> >> caused by an external ADC and remaining system. In that case,
> >> >> I drive the 0-100% control with similar attention to
> >> >> precision control of the external device (heater, boule
> >> >> puller, etc.) I also go to the trouble to ensure, where
> >> >> branching code exists, that each branch takes exactly the
> >> >> same number of cycles.
> >> >> 
> >> >> I very much dislike, in cases like this, devices with varying
> >> >> interrupt latencies (which is almost guaranteed to happen if
> >> >> the processor has instructions with varying execution time.)
> >> >> I can control my code and the number of cycles each edge of
> >> >> it may take, but the hardware latency is out of my control.
> >> >> So I look for processors where it is predictable, if I need
> >> >> that.
> >> >> 
> >> >> An STM32 would not qualify in the case I am thinking about.
> >> >> 
> >> >IIRC, the Cortex M4 instructions which would cause the greatest
> >> >variation in interrupt latency (load and store multiple and divide)
> >> >are, themselves, interruptible.  I would guess that the interrupt
> >> >latency variation would be on the order of 1 to 2 cycle times---
> >> >or about 12.5 nSec for a 168MHz clock.  The overall latency is
> >> >listed as 12 clock cycles or about 60-70nSec.
> >> 
> >> I gained some slightly useful benefits by having exactly 0
> >> cycle variation in the application I'm talking about. One
> >> cycle (20ns in that application) of variation would have made
> >> a difference to me. The fact that I didn't have to add
> >> hardware to gain that tiny advantage ALSO was a useful
> >> benefit.
> >> 
> >> In the M4, there is also a pipeline and, if I remember,
> >> "faults" can occur not only in one stage. (I might be wrong
> >> about that.) You have to consider everything -- instruction
> >> faults (memory, etc.) But I admit I'm pretty ignorant of the
> >> M4, too.
> >> 
> >> >I can see that multiple-cycle instructions  with variable execution time 
> >> >inside the interrupt handler could cause phase variations in the output.  
> >> >It might require more work to eliminate them than would be the case
> >> >with a DSP having only a few rare cases to consider.
> >> >
> >> >If you're using a DAC in the loop and want consistent phase delays,
> >> >does that require a flash DAC?  With a successive approximation DAC, the 
> >> >delay until you get the desired output would seem to depend on the
> >> >value output unless there is a fast sample-and-hold between the
> >> >DAC and the control system.
> >> 
> >> I added the full closed loop control PID into the instrument.
> >> (It didn't have the ability beforehand.) In doing so, there
> >> was no DAC involved at that stage, anymore.
> >> 
> >> >If you want outputs free of all phase jitter, a sample and hold
> >> >triggered by a hardware clock could solve the problem.  The problem
> >> >then becomes what synchronization delays are acceptable.
> >> 
> >> Price, size, power, etc., all mattered. Very competitive
> >> marketplace in that case.
> >> 
> >> Jon
> >
> >The 50MHz ADSP-21061KSZ-200-ND  is $101.43 qty 1 at Digikey.  The
> >168MHz STM32F407 is about $12.  That seems pretty competitive to
> >me ;-)  How much work would it take to tune the STM32 and would you
> >sell enough with an $80 lower price to be worth the effort?
> 
> The ADSP-21xxx is not even close to the ADSP-21xx and I
> wasn't using the ADSP-21061KSZ. It was an ADSP-2111 and
> ADSP-2105. They were MUCH cheaper at the time (circa early
> 1990's) and the competition elsewhere was effectively zero.
> Since then there are many more options and many more players
> and the ADSP-21xx processors I was using probably aren't even
> available (much, if at all.) If I were doing this today, I'd
> pick something else.
> 
> >I suspect the STM32 is lower in power at 168MHz than the DSP
> >at 50MHz, but I haven't verified that guess.
> 
> There was NO floating point on the units I used. A nice
> barrel shifter (combinatorial, one-cycle) though and I used
> it for writing my own floating point. Power consumption was
> quite low --- for the time.
> 

I think all the Cortex M3 and M4s have single-cycle barrel shifters and
single-cycle multiply.  Integer divides can take a few cycles.

Such are the advances in electronics that you get all this capability
for less than the cost and power of an 8-bit CPU from 15 years ago.

I'm waiting on delivery of one of the Parallella multicore systems
from Adapteva.  It has an ARM supervisor running Linux and multicore
RISC chips with FPUs.  More number crunching power than I should ever
need.

For now, I just appreciate the ability  of the CM4 to run fairly
simple IIR and FIR filters using floating point coefficients I
generate with Matlab.
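
For anyone who hasn't written one, a direct-form FIR filter with floating
point coefficients is only a few lines.  This is a rough sketch in Python
rather than the C you would actually run on the CM4, and the coefficients
are a made-up 4-tap moving average, not anything from a real Matlab design.
The name fir_filter is my own.

```python
def fir_filter(coeffs, samples):
    """Direct-form FIR: y[n] = sum over k of coeffs[k] * x[n-k].

    Samples before the start of the input are treated as zero.
    """
    # Delay line holding the most recent len(coeffs) inputs, newest first.
    history = [0.0] * len(coeffs)
    out = []
    for x in samples:
        # Shift the new sample in and drop the oldest one.
        history = [x] + history[:-1]
        # Multiply-accumulate across all taps.
        out.append(sum(c * h for c, h in zip(coeffs, history)))
    return out


# Hypothetical coefficients: a 4-tap moving average (illustration only).
coeffs = [0.25, 0.25, 0.25, 0.25]

# Unit step input: the output ramps up over four samples, then settles at 1.0.
step = [1.0] * 6
result = fir_filter(coeffs, step)
print(result)
```

On the CM4 the inner multiply-accumulate is what the FPU speeds up; the
structure stays the same.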

Mark Borgerson

Reply by Mark Borgerson ●January 3, 2013

In article <c9ece8pp43md4q71b1047451sejrfabfes@4ax.com>, 
jonk@infinitefactors.org says...
> 
<<SNIP>>
> 
> I'll give another example of my mindset. The current spate of
> multi-GHz x86 processors from Intel are fabricated with
> feature sizes and GTL technology (unless they've got
> something still newer since I last looked) that would permit
> the production of a VERY LOW power 100MHz laptop that could
> easily run for quite some time using nothing more than a few
> AA batteries. Nothing special. Just cheap Costco alkalines.

OK, so you can run the CPU with a few milliwatts. Can you do
anything other than a reflective LCD display?  Lighting
up even an 11" display could suck those  AA cells dry
pretty quickly.
> (In fact, it was done once with the HP Omnibook 300/Win
> 3.1... but with older feature sizes.) The current technology
> would wipe the floor with that older HP Omnibook, which
> itself put Windows completely in ROM (no boot from secondary
> storage) and would run for weeks on AA batteries available
> anywhere in the world. I need nothing more than a 80386 using
> those feature sizes -- no FP -- and running at 66MHz to
> 100MHz for word processing. The nice thing about that
> specific Omnibook (and none of the others) is that there was
> no special battery technology, it weighed almost nothing,
> included a wonderful pop-out mouse built in, and required
> nothing special when you closed it. It just shut off all
> power except and only what was required to retain the static
> ram. So when I opened it, I was exactly where I left off --
> cursor, etc -- with exactly 0 seconds wait. When someone
> asked me a question, I closed it, answered the question,
> opened the laptop, and just kept on going. Weight was VERY
> low -- lower than any laptop I'm aware of today.

Sounds sort of like a MacBook Air without the WIFI and
11" LCD screen.
> 
> But there is no longer a marketplace for this. So I can only
> get laptops with MUCH MUCH shorter active runtimes, despite
> huge advances in battery technology (for much more cost) and
> despite huge advances in FAB technology (which could be used
> to greatly reduce power consumption from that time.)

I never did get an estimate on battery life for my OLPC laptop.
I suspect that the onboard wifi contributed about half the
power drain.  I also suspect that the older Omnibooks didn't
have either Ethernet or Wifi active most of the time.  Those
two alone will suck up a couple of AA cells pretty quickly.

You can easily run an ARM CM4 on an average current of 15mA.
That should give you at least 100 hours off AA cells---it's
the peripherals that people expect today that kill the batteries.
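
The 100-hour figure is easy to sanity-check.  The sketch below assumes
roughly 2000mAh per alkaline AA at a light load (my assumption, not a
datasheet number) and ignores regulator losses and the cutoff voltage.
Cells in series raise the voltage but not the capacity, so one cell's
mAh rating is the number that matters.

```python
def runtime_hours(capacity_mah, avg_current_ma):
    """Idealized runtime: capacity divided by average current draw."""
    return capacity_mah / avg_current_ma


# Assumed alkaline AA capacity at light load (illustrative figure).
AA_CAPACITY_MAH = 2000.0

# Average current for the CM4 as stated in the post.
CM4_AVG_CURRENT_MA = 15.0

hours = runtime_hours(AA_CAPACITY_MAH, CM4_AVG_CURRENT_MA)
print(f"Estimated runtime: {hours:.0f} hours")
```

Under those assumptions the estimate comes out somewhat above 100 hours,
which leaves some margin for conversion losses.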

I've done a lot of low-power stuff---instruments that sit
on oceanographic moorings  for a year at a time.  Displays
aren't used and couldn't be continuously powered.  The big
power sucker is the storage medium---at 200MB per day,
alkaline cells wouldn't cut it.  We end up using Lithium
primary cells, which makes shipping units and batteries
a true PITA!


Mark Borgerson

Reply by ●January 4, 2013

Mark Borgerson <mborgerson@comcast.net> wrote:
> IIRC, the Cortex M4 instructions which would cause the greatest
> variation in interrupt latency (load and store multiple and divide)
> are, themselves, interruptible.  I would guess that the interrupt
> latency variation would be on the order of 1 to 2 cycle times---
> or about 12.5 nSec for a 168MHz clock.  The overall latency is
> listed as 12 clock cycles or about 60-70nSec.

At least on the M3, the interrupt tail-chaining optimization can vary the 
latency by up to six cycles if I'm reading the documentation right (if 
there is a pending interrupt when the CPU is leaving an ISR, it will skip 
unstacking and immediately restacking the CPU registers). I don't know if 
there is a way to turn off this feature, and I assume it is also present 
on the M4. The newer and faster Cortexes also have various flash 
acceleration mechanisms that are growing ever closer to full caches, but 
those can at least be turned off (at the expense of increased latency).
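
To put the cycle counts in time units, here is a quick conversion.  It
uses the 168MHz clock and the 12-cycle latency quoted above, plus the
six-cycle tail-chaining variation from my reading of the M3 documentation.
Applying that six-cycle figure to a 168MHz M4 is an assumption on my part.

```python
def cycles_to_ns(cycles, clock_hz):
    """Convert a CPU cycle count to nanoseconds at the given clock."""
    return cycles * 1e9 / clock_hz


CLOCK_HZ = 168e6  # STM32F407 clock from the quoted post

# Nominal interrupt latency quoted in the thread: 12 cycles.
base_latency_ns = cycles_to_ns(12, CLOCK_HZ)

# Variation guessed in the quoted post: 1 to 2 cycles.
small_jitter_ns = cycles_to_ns(2, CLOCK_HZ)

# Possible tail-chaining variation (assumed to carry over from the M3).
tail_chain_jitter_ns = cycles_to_ns(6, CLOCK_HZ)

print(f"12-cycle latency:      {base_latency_ns:.1f} ns")
print(f"2-cycle variation:     {small_jitter_ns:.1f} ns")
print(f"6-cycle tail-chaining: {tail_chain_jitter_ns:.1f} ns")
```

If the six-cycle figure does apply, the variation is about half of the
nominal latency rather than a cycle or two.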

-a