
Looking for ARM system with RTOS

Started by Bruce Varley December 31, 2012
On Wed, 02 Jan 2013 11:38:33 -0800, Jon Kirwan wrote:

> On Wed, 02 Jan 2013 13:24:06 -0600, Tim Wescott <tim@seemywebsite.com>
> wrote:
>
>><snip>
>>Another thought, if you really want to dive off the deep end, if you
>>absolutely need real-time performance on that USB port, and if you can
>>control what all gets plugged into it:
>>
>>Things like USB host stacks are often huge and expensive because they
>>have to work with _everything_ that might get attached to them. If you
>>want to become a freaking USB guru, you may find that writing a
>>real-time host stack that _just_ talks to the _one_ device that you're
>>going to hang on the thing is a lot easier (but still not at all
>>trivial) than trying to write an entire compliant, well-working USB
>>stack.
>>
>>But pay attention: I'm not entirely sure that USB itself is real-time.
>>There are some protocols (TCP/IP, for instance) that are inherently
>>non-real-time. If your USB is one of these then it doesn't matter what
>>OS you're using: you are never going to achieve real-timeliness.
>
> I remember, from the week I was reading through the USB 2.0 spec, that
> there is supposed to be a 1 ms queue/timer on the host side that a slave
> device can expect to use. My recollection is that you can't guarantee
> where, within the 1 ms period until the next one, your service lies. But
> you can, if I recall right, expect some service in that 1 ms window at
> some point. [But I don't remember what happens when/if several (and too
> many) such services that are queued for one particular 1 ms event happen
> to require long transfers and might push some requests beyond the
> window.]
>
> So, combined with a priori knowledge about competing devices, it's
> possible that some USB service could be guaranteed each 1 ms tick, but
> that the variability in when the service was provided would also be
> about 1 ms as well.
>
> I haven't read a single word of the 3.0 spec. No idea there.
Interesting. I would certainly hesitate a good long time before I made
anyone any promises that depended on getting anything resembling
real-time performance out of USB.

Yes, I know you can run audio through it -- but I'm not sure how _good_
the audio you can run through it is, how much of the audio works because
it's really real-time and how much is just deep FIFOs pasted over a
bunch of problems, or how far away you can get from the
optimized-for-audio software paths in the OS before things break down.

--
My liberal friends think I'm a conservative kook. My conservative friends
think I'm a liberal kook. Why am I not happy that they have found common
ground?

Tim Wescott, Communications, Control, Circuits & Software
http://www.wescottdesign.com
On 02/01/2013 22:37, Tim Wescott wrote:
> On Wed, 02 Jan 2013 11:38:33 -0800, Jon Kirwan wrote:
>
> <snip>
>
> Interesting. I would certainly hesitate a good long time before I made
> anyone any promises that depended on getting anything resembling
> real-time performance out of USB.
>
> Yes, I know you can run audio through it -- but I'm not sure how _good_
> the audio you can run through it is, how much of the audio works because
> it's really real-time and how much is just deep FIFOs pasted over a
> bunch of problems, or how far away you can get from the
> optimized-for-audio software paths in the OS before things break down.
I don't know how "real-time" isochronous transfers actually are, but
here they say that "Isochronous transfers are used to transfer data in
real-time between host and device. When an isochronous endpoint is set
up by the host, the host allocates a specific amount of bandwidth to the
isochronous endpoint, and it regularly performs an IN- or OUT-transfer
on that endpoint. For example, the host may OUT 1 KByte of data every
125 us to the device. Since a fixed and limited amount of bandwidth has
been allocated, there is no time to resend data if anything goes wrong.
The data has a CRC as normal, but if the receiving side detects an error
there is no resend mechanism."

http://www.edn.com/design/consumer/4376143/Fundamentals-of-USB-Audio
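[Editor's note: as a back-of-the-envelope check on those figures -- an
illustration, not part of the original thread -- the quoted service rate
pins down both the reserved bandwidth and the worst-case delivery
jitter:]

```c
#include <stdio.h>

/* Worked example for the isochronous endpoint described in the EDN
 * quote above. The numbers (1 KByte per 125 us high-speed microframe)
 * come from that quote; the rest is arithmetic. */
int main(void)
{
    const double bytes_per_service = 1024.0;  /* 1 KByte per microframe */
    const double microframe_s      = 125e-6;  /* USB 2.0 HS microframe  */

    double bandwidth = bytes_per_service / microframe_s; /* bytes/s */

    printf("Reserved bandwidth: %.1f MB/s\n", bandwidth / 1e6);
    printf("Worst-case delivery jitter: %.0f us (one microframe)\n",
           microframe_s * 1e6);
    return 0;
}
```

[Run as-is it reports about 8.2 MB/s reserved and a 125 us jitter bound,
which is the sense in which isochronous USB is "real-time": guaranteed
bandwidth and bounded latency, but no retries.]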
In article <9289e8p4ecr3qalegrs5avpq9nmk1ap8jb@4ax.com>, 
jonk@infinitefactors.org says...
>
> On Wed, 2 Jan 2013 12:52:47 -0800, Mark Borgerson
> <mborgerson@comcast.net> wrote:
>
> >In article <l996e8dcrons0s9d6104r0t500fra0c0t2@4ax.com>,
> >jonk@infinitefactors.org says...
> >>
> >> On Tue, 01 Jan 2013 12:09:40 +0200, upsidedown@downunder.com
> >> wrote:
> >>
> >> >On Tue, 1 Jan 2013 09:54:30 +0800, "Bruce Varley" <bv@NoSpam.com>
> >> >wrote:
> >> >
> >> >>I need:
> >> >>
> >> >>o CPU clock 200MHz or higher.
> >> >>
> >> >>o 2 serial ports, with access to the logic level lines on at least
> >> >>one (LV OK).
> >> >>
> >> >>o USB support. Socket support also would be nice, not essential.
> >> >>
> >> >>o Some sort of file system.
> >> >>
> >> >>o Guaranteed turnaround of 10 ms, even lower would be nice. My ARM
> >> >>Linux won't do better than 20.
<<SNIP>>
> >>
> >> 10ms turnaround would be... unacceptable.
> >>
> >I'm a bit puzzled here. I usually read '10ms' as 10 milliseconds.
>
> As do I.
>
> >That seems like a lot of time for most embedded systems RTOS
> >variants, which have task switch times in the low microseconds
> >on chips like 160MHz ARM-Cortex STM32s.
>
> I was using a 20ns cycle time ADSP-21xx processor (50MHz.)
> It's a DSP with fixed cycle counts (1) for each instruction
> and a guaranteed interrupt latency that NEVER varies (with
> certain, inconsequential [to my application] conditions being
> met.)
>
> >10 milliseconds would certainly be too long a response time on
> >many of the instruments I've developed--none of which use
> >an RTOS. I'm just now starting to play around with
> >ChibiOS and uC/OS-II on the STM32 chips.
>
> In measurement instruments, which may be used in closed-loop
> control systems, predictability (both in terms of phase delay
> relative to the sensor observation and also in terms of the
> variability allowed in that phase delay) is vital.
>
> I shoot for (and achieve where it is important) variability
> that is measured as 0 or, if forced to very small integers > 0,
> 1 or maybe 2 cycles of variation... measurement to
> measurement... both in sampling the sensor as well as in
> outputting it via a DAC. (I can't help what happens after.)
> In the best of all cases, I implement the closed-loop control
> in the instrument as well, so that there is no variability
> caused by an external ADC and remaining system. In that case,
> I drive the 0-100% control with similar attention to
> precision control of the external device (heater, boule
> puller, etc.) I also go to the trouble to ensure, where
> branching code exists, that each branch takes exactly the
> same number of cycles.
>
> I very much dislike, in cases like this, devices with varying
> interrupt latencies (which is almost guaranteed to happen if
> the processor has instructions with varying execution time.)
> I can control my code and the number of cycles each edge of
> it may take, but the hardware latency is out of my control.
> So I look for processors where it is predictable, if I need
> that.
>
> An STM32 would not qualify in the case I am thinking about.
>
IIRC, the Cortex M4 instructions which would cause the greatest
variation in interrupt latency (load and store multiple, and divide)
are themselves interruptible. I would guess that the interrupt latency
variation would be on the order of 1 to 2 cycle times -- about 12 ns
for a 168 MHz clock. The overall latency is listed as 12 clock cycles,
or roughly 71 ns at that clock rate.

I can see that multiple-cycle instructions with variable execution time
inside the interrupt handler could cause phase variations in the output.
It might require more work to eliminate them than would be the case
with a DSP having only a few rare cases to consider.

If you're using a DAC in the loop and want consistent phase delays,
does that require a flash DAC? With a successive approximation DAC, the
delay until you get the desired output would seem to depend on the
value output unless there is a fast sample-and-hold between the
DAC and the control system.

If you want outputs free of all phase jitter, a sample and hold
triggered by a hardware clock could solve the problem. The problem
then becomes what synchronization delays are acceptable.

Mark Borgerson
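[Editor's note: one way to put numbers on latency variation like this --
an editorial sketch, not from the thread -- is the free-running cycle
counter in the ARMv7-M DWT unit: timestamp successive entries of a
strictly periodic timer ISR and look at the spread. The register
addresses are the architectural ones; the handler name and the timer
setup are assumptions.]

```c
#include <stdint.h>

/* ARMv7-M Debug Watchpoint and Trace (DWT) registers -- architectural
 * addresses, present on Cortex-M3/M4 parts such as the STM32F4. */
#define DEMCR      (*(volatile uint32_t *)0xE000EDFC)
#define DWT_CTRL   (*(volatile uint32_t *)0xE0001000)
#define DWT_CYCCNT (*(volatile uint32_t *)0xE0001004)

static volatile uint32_t delta_min = 0xFFFFFFFFu;
static volatile uint32_t delta_max = 0u;

void cyccnt_init(void)
{
    DEMCR      |= (1u << 24);   /* TRCENA: enable the DWT unit  */
    DWT_CYCCNT  = 0u;
    DWT_CTRL   |= 1u;           /* CYCCNTENA: start the counter */
}

/* Hypothetical handler for a strictly periodic hardware timer. With a
 * constant timer period, (delta_max - delta_min) is the entry-to-entry
 * latency jitter in CPU cycles. */
void TIM2_IRQHandler(void)
{
    static uint32_t last;
    static int primed;
    uint32_t now = DWT_CYCCNT;

    if (primed) {
        uint32_t delta = now - last;   /* cycles between entries */
        if (delta < delta_min) delta_min = delta;
        if (delta > delta_max) delta_max = delta;
    }
    last   = now;
    primed = 1;

    /* ... clear the timer's interrupt flag here (vendor-specific) ... */
}
```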
On Wed, 2 Jan 2013 22:43:15 -0800, Mark Borgerson
<mborgerson@comcast.net> wrote:

>In article <9289e8p4ecr3qalegrs5avpq9nmk1ap8jb@4ax.com>,
>jonk@infinitefactors.org says...
>
><<SNIP>>
>
>IIRC, the Cortex M4 instructions which would cause the greatest
>variation in interrupt latency (load and store multiple, and divide)
>are themselves interruptible. I would guess that the interrupt latency
>variation would be on the order of 1 to 2 cycle times -- about 12 ns
>for a 168 MHz clock. The overall latency is listed as 12 clock cycles,
>or roughly 71 ns at that clock rate.
I gained some slightly useful benefits by having exactly 0 cycle
variation in the application I'm talking about. One cycle (20ns in that
application) of variation would have made a difference to me. The fact
that I didn't have to add hardware to gain that tiny advantage ALSO was
a useful benefit.

In the M4, there is also a pipeline and, if I remember, "faults" can
occur not only in one stage. (I might be wrong about that.) You have to
consider everything -- instruction faults (memory, etc.) But I admit I'm
pretty ignorant of the M4, too.
>I can see that multiple-cycle instructions with variable execution time
>inside the interrupt handler could cause phase variations in the output.
>It might require more work to eliminate them than would be the case
>with a DSP having only a few rare cases to consider.
>
>If you're using a DAC in the loop and want consistent phase delays,
>does that require a flash DAC? With a successive approximation DAC, the
>delay until you get the desired output would seem to depend on the
>value output unless there is a fast sample-and-hold between the
>DAC and the control system.
I added the full closed-loop PID control into the instrument. (It didn't
have that ability beforehand.) In doing so, there was no longer a DAC
involved at that stage.
>If you want outputs free of all phase jitter, a sample and hold
>triggered by a hardware clock could solve the problem. The problem
>then becomes what synchronization delays are acceptable.
Price, size, power, etc., all mattered. Very competitive marketplace in
that case.

Jon
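[Editor's note: since Jon mentions folding the full PID loop into the
instrument, here is a minimal fixed-point PID step for context -- an
editorial sketch under assumed Q16.16 scaling; the thread does not show
his actual controller.]

```c
#include <stdint.h>

typedef struct {
    int32_t kp, ki, kd;       /* gains in Q16.16 fixed point          */
    int32_t integ;            /* integrator state                     */
    int32_t prev_err;         /* previous error sample                */
    int32_t out_min, out_max; /* output clamp, e.g. 0..100% drive     */
} pid_state_t;

/* One controller step; called at a fixed, jitter-free sample rate so
 * the sample period can be folded into ki and kd ahead of time. */
static int32_t pid_step(pid_state_t *s, int32_t setpoint, int32_t meas)
{
    int32_t err   = setpoint - meas;
    int32_t deriv = err - s->prev_err;
    s->prev_err   = err;

    s->integ += err;

    int64_t out = (int64_t)s->kp * err
                + (int64_t)s->ki * s->integ
                + (int64_t)s->kd * deriv;
    out >>= 16;               /* rescale product back to Q16.16 */

    /* clamp, with crude anti-windup: undo the accumulation that
     * pushed the output past the limit */
    if (out > s->out_max) { out = s->out_max; s->integ -= err; }
    if (out < s->out_min) { out = s->out_min; s->integ -= err; }
    return (int32_t)out;
}
```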
On Thursday, January 3, 2013 8:43:15 AM UTC+2, Mark Borgerson wrote:
> If you're using a DAC in the loop and want consistent phase delays,
> does that require a flash DAC? With a successive approximation DAC, the
successive approximation DAC ??
"Mark Borgerson" <mborgerson@comcast.net> wrote in message 
news:MPG.2b4e3dfa12d633959899c8@news.eternal-september.org...
> In article <l996e8dcrons0s9d6104r0t500fra0c0t2@4ax.com>,
> jonk@infinitefactors.org says...
>>
>> On Tue, 01 Jan 2013 12:09:40 +0200, upsidedown@downunder.com
>> wrote:
>>
>> <snip: Bruce's original requirements, quoted in full earlier>
>>
>> >What Linux version and what scheduling policy are you using to get
>> >20 ms?
>> >
>> >If you need 100 % hard real time performance no general purpose OS
>> >will do. If you can live with soft real time (99.9... % reliability),
>> >general purpose OSes such as Windows/Linux can be used.
>>
>> I pretty much agree with this. I write my own O/S and, not
>> infrequently, use a 2 microsecond resolution process timer (I
>> can switch tasks in 5 cycles -- 100 nanoseconds for a task
>> switch on the ADSP-21xx I used it on -- with no variation in
>> timing, guaranteed fixed phase delay.) But the code and the
>> task switcher are designed together to work well. In that
>> case, my shortest task-to-task delay was 20 microseconds. But
>> I also had to control an external ADC at about 1.5Ms/s and
>> the precision there had to be no more variation, sample to
>> sample, than about 5 nanoseconds. That was achieved not with
>> hardware, but with software running on it.
>>
>> No file system, though. All queue insertions were done
>> between process starts, so the time involved was hidden.
>>
>> >It should be noted that in any priority based scheduling, only the
>> >highest priority task response times (interrupts, rescheduling) can
>> >be guaranteed, unless you have full control of the execution time of
>> >that high priority task(s).
>>
>> That's also a sensible comment.
>>
>> What I did was to carefully craft the processes, knowing
>> their durations, and lay it all out on a timing diagram so
>> that I could guarantee each of them their exact start times.
>> The end times weren't quite as important, so long as I
>> ensured they were done.
>>
>> A delta queue design helps. The timer only sees the top
>> process (all other processes are queued with 'delta' values
>> relative to the processes ahead of them) and decrements that
>> time value. When it drops to zero, the process is started.
>> All remaining processes only have delta timers relative to
>> that, so again the timer only needs to decrement the top
>> timer value. Keeps 'variability' in timing to zero if you are
>> careful (assembly) or near zero if it's in C.
>>
>> The O/S I wrote allows me to include or exclude features at
>> compile-time, too. I can enable or disable pre-emption,
>> enable or disable priorities, enable or disable semaphores,
>> and so on.
>>
>> But I've never cared about including a file system for it --
>> I don't need that on measurement instrumentation. What I do
>> need is precision sampling and absolute guarantees on DAC or
>> measurement outputs with FFT results you'd expect, so that an
>> external device (boule puller) can have closed loop control
>> with well designed control algorithms behaving as expected
>> from the math.
>>
>> 10ms turnaround would be... unacceptable.
>>
> I'm a bit puzzled here. I usually read '10ms' as 10 milliseconds.
> That seems like a lot of time for most embedded systems RTOS
> variants, which have task switch times in the low microseconds
> on chips like 160MHz ARM-Cortex STM32s.
>
> 10 milliseconds would certainly be too long a response time on
> many of the instruments I've developed--none of which use
> an RTOS. I'm just now starting to play around with
> ChibiOS and uC/OS-II on the STM32 chips.
>
> Mark Borgerson
Yes, it's hard for me to understand too, particularly given that the
kernel timer.c source code indicates that the base clock rate for the
kernel is in the very low millisecond range (several build options that
I can't decipher). But a standard interval timer set for a 10 ms or
shorter repeat time delivers 20 ms. The supplier apparently doesn't want
to buy into timing issues; my queries aren't being answered. They've
probably just given up, can't blame them.

Anyway, enough! I just want a system with heaps of CPU and timing
headroom so I can implement my software knowing that capacity won't be a
problem.
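[Editor's note: Jon's delta queue, quoted above, is worth a concrete
illustration. The sketch below -- editorial, with assumed names, not his
code -- shows the key property he describes: the tick handler only ever
touches the head node's counter, so its cost does not grow with the
number of queued processes.]

```c
#include <stdint.h>

/* Delta queue as described above: each node stores its delay *relative
 * to the node ahead of it*, so the periodic tick decrements a single
 * counter regardless of queue length. */
struct dq_node {
    uint32_t delta;            /* ticks remaining after predecessor fires */
    void (*run)(void);         /* process entry point                     */
    struct dq_node *next;
};

static struct dq_node *dq_head;

/* Insert a node that should fire 'ticks' ticks from now, keeping the
 * chain sorted by cumulative expiry time. */
void dq_insert(struct dq_node *n, uint32_t ticks)
{
    struct dq_node **pp = &dq_head;
    while (*pp && (*pp)->delta <= ticks) {
        ticks -= (*pp)->delta; /* convert to delta past this node */
        pp = &(*pp)->next;
    }
    n->delta = ticks;
    n->next  = *pp;
    if (*pp)
        (*pp)->delta -= ticks; /* successor is now relative to us */
    *pp = n;
}

/* Called from the timer tick: a constant, tiny amount of work. */
void dq_tick(void)
{
    if (dq_head && dq_head->delta)
        dq_head->delta--;
    while (dq_head && dq_head->delta == 0) {
        struct dq_node *n = dq_head;
        dq_head = n->next;
        n->run();              /* start the process whose time arrived */
    }
}
```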
In article <d918d586-960d-4c43-a7fc-eaa95ec8e71e@googlegroups.com>, 
RobertGush@gmail.com says...
>
> On Thursday, January 3, 2013 8:43:15 AM UTC+2, Mark Borgerson wrote:
> > If you're using a DAC in the loop and want consistent phase delays,
> > does that require a flash DAC? With a successive approximation DAC, the
>
> successive approximation DAC ??
Ooops---got my DACs and ADCs mixed up. Even serial-interface DACs
generally have input latches, so there is a single step to the new
output voltage.

It does occur to me that if there is any capacitive loading on the DAC
output, the input to the next stage will have some variation in the time
at which it settles at the final voltage. The variation will depend on
the size of the DAC output step from the last voltage. I think
controlling that phase delay will be as important as controlling jitter
in the DAC output timing.

Mark Borgerson
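[Editor's note: the step-size-dependent settling Mark describes can be
quantified with first-order RC arithmetic -- an editorial worked
example; all component values below are illustrative assumptions. The
time to settle within half an LSB after a step of size dv is
t = RC * ln(dv / (LSB/2)).]

```c
#include <math.h>
#include <stdio.h>

/* Settling time of a first-order RC load on a DAC output: time for an
 * exponential step of size dv to come within half an LSB of its
 * target. All component values are illustrative. */
int main(void)
{
    const double r    = 100.0;         /* ohms: DAC output resistance */
    const double c    = 1e-9;          /* farads: capacitive load     */
    const double vref = 2.5;           /* volts, assumed 12-bit DAC   */
    const double lsb  = vref / 4096.0;

    double steps[] = { lsb, vref / 2, vref }; /* small, half, full scale */
    for (int i = 0; i < 3; i++) {
        double t = r * c * log(steps[i] / (lsb / 2.0));
        printf("step %8.4f V settles to 1/2 LSB in %6.1f ns\n",
               steps[i], t * 1e9);
    }
    return 0;
}
```

[With these values a one-LSB step settles in about 69 ns while a
full-scale step needs roughly 0.9 us -- a 13x spread, which is exactly
the value-dependent delay described above.]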
In article <u3eae8d0bt9c51qq0tbp30mucskp1o4csd@4ax.com>, 
jonk@infinitefactors.org says...
>
> On Wed, 2 Jan 2013 22:43:15 -0800, Mark Borgerson
> <mborgerson@comcast.net> wrote:
>
> <<SNIP>>
>
> Price, size, power, etc., all mattered. Very competitive
> marketplace in that case.
>
> Jon
The 50 MHz ADSP-21061KSZ-200-ND is $101.43 qty 1 at Digikey. The
168 MHz STM32F407 is about $12. That seems pretty competitive to me ;-)
How much work would it take to tune the STM32, and would you sell enough
with an $80 lower price to be worth the effort?

I suspect the STM32 is lower in power at 168 MHz than the DSP at 50 MHz,
but I haven't verified that guess.

Mark Borgerson
In article <5dCdnY9pU8zsCHjNnZ2dnUVZ7qydnZ2d@westnet.com.au>, 
bv@NoSpam.com says...
> > "Mark Borgerson" <mborgerson@comcast.net> wrote in message > news:MPG.2b4e3dfa12d633959899c8@news.eternal-september.org... > > In article <l996e8dcrons0s9d6104r0t500fra0c0t2@4ax.com>, > > jonk@infinitefactors.org says... > >> > >> On Tue, 01 Jan 2013 12:09:40 +0200, upsidedown@downunder.com > >> wrote: > >> > >> >On Tue, 1 Jan 2013 09:54:30 +0800, "Bruce Varley" <bv@NoSpam.com> > >> >wrote: > >> > > >> >>I need: > >> >> > >> >>o CPU clock 200MHz or higher. > >> >> > >> >>o 2 serial ports, with access to the logic level lines on at least one > >> >>(LV > >> >>OK). > >> >> > >> >>o USB support. Socket support also would be nice, not essential. > >> >> > >> >>o Some sort of file system. > >> >> > >> >>o Guaranteed turnround of 10mS, even lower would be nice. My ARM Linux > >> >>won'd do better than 20. > >> > > >> >What Linux version and what scheduling policy are you using to get 20 > >> >ms ? > >> > > >> >If you need 100 % hard real time performance no general purpose OS > >> >will do. If you can live with soft real time (99.9... % reliability), > >> >general purpose OSes such as Windows/Linux can be used, > >> > >> I pretty much agree with this. I write my own O/S and, not > >> infrequently, use a 2 microsecond resolution process timer (I > >> can switch tasks in 5 cycles -- 100 nanoseconds for a task > >> switch on the ADSP-21xx I used it on -- with no variation in > >> timing, guaranteed fixed phase delay.) But the code and the > >> task switcher are designed together to work well. In that > >> case, my shortest task-to-task delay was 20 microseconds. But > >> I also had to control an external ADC at about 1.5Ms/s and > >> the precision there had to be no more variation, sample to > >> sample, than about 5 nanoseconds. That was achieved not with > >> hardware, but with software running on it. > >> > >> No file system, though. All queue insertions were done > >> between process starts, so the time involved was hidden. > >> > >> >It should be noted that in any priority based scheduling, only the > >> >highest priority task response times (interrupts, rescheduling) can be > >> >guarantied, unless you have full control of the execution time of that > >> >high priority task(s). > >> > >> That's also a sensible comment. > >> > >> What I did was to carefully craft the processes, knowing > >> their durations, and lay it all out on a timing diagram so > >> that I could guarantee each of them their exact start times. > >> The end times weren't quite as important, so long as I > >> ensured they were done. > >> > >> A delta queue design helps. The timer only sees the top > >> process (all other processes are queued with 'delta' values > >> relative to the processes ahead of them) and decrements that > >> time value. When it drops to zero, the proecss is started. > >> All remaining processes only have delta timers relative to > >> that, so again the timer only needs to decrement the top > >> timer value. Keeps 'variability' in timing to zero if you are > >> careful (assembly) or near zero if it's in C. > >> > >> The O/S I wrote allows me to include or exclude features at > >> compile-time, too. I can enable or disable pre-emption, > >> enable or disable priorities, enable or disable semaphores, > >> and so on. > >> > >> But I've never cared about including a file system for it -- > >> I don't need that on measurement instrumentation. 
What I do > >> need is precision sampling and absolute guarantees on DAC or > >> measurement outputs with FFT results you'd expect, so that an > >> external device (boule puller) can have closed loop control > >> with well designed control algorithms behaving as expected > >> from the math. > >> > >> 10ms turnaround would be... unacceptable. > >> > > I'm a bit puzzled here. I usually read '10ms' as 10 milliseconds. > > That seems like a lot of time for most embedded systems RTOS > > variants, which have task switch times in the low microseconds > > on chips like 160MHz ARM-Cortex STM32s. > > > > 10milliseconds would certainly be too long a response time on > > many of the instruments I've developed--none of which use > > an RTOS. I'm just now starting to play around with > > ChiBios and UCoS-II on the STM32 chips. > > > > > > Mark Borgerson > > > Yes, it's hard for me to understand too, particularly given that the kernel > timer.c source code indicates that the base clock rate for the kernel is in > the very low mS range (several build options that I can't decipher). But a > standard interval timer set for 10mS or less repeat time delivers 20mS. The > supplier apparently doesn't want to buy into timing issues, my queries > aren't being answered. They've probably just given up, can't blame them. > > Anyway, enough! I just want a system with heaps of CPU and timing headroom > so I can implement my software knowing that capacity won't be a problem.
With an RTOS, as opposed to Linux, whatever chore you want to execute
with 10 ms timing would be set to trigger on a hardware timer interrupt
and run either in the IRQ handler or in a high-priority task triggered
by the timer IRQ. With proper attention to interrupt priorities and
handler coding, it should be possible to execute that chore with a
jitter of a few hundred nanoseconds.

Then you are faced with balancing the timing precision of the RTOS
against the rich set of support services that come with Linux (TCP
stacks, USB stacks, file systems, display options, etc. etc.). Most of
the things I do are tilted in favor of simplicity and timing precision.

I went through a few years of stuffing Linux into an autonomous vehicle
guidance system. Timing issues reared up often enough to make that a
frustrating job at times. I originally advocated for an RTOS, but was
overruled by other engineers with a Linux background. The alternative
would have been an RTOS with good file and networking support---but they
are available, and probably more common in military aerospace systems
than Linux.

Mark Borgerson
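[Editor's note: a bare-metal sketch of that pattern -- editorial; the
vector name, the timer setup, and the two application hooks are
assumptions, not code from the thread. The ISR does only the
time-critical step, and a flag defers everything else to background
level, so the 10 ms chore's jitter is bounded by interrupt latency
rather than by scheduling.]

```c
#include <stdint.h>

extern void do_time_critical_io(void); /* hypothetical application hooks */
extern void process_results(void);

static volatile uint32_t ticks;
static volatile int chore_pending;

/* Hypothetical handler for a periodic hardware timer set to 10 ms.
 * Only the time-critical work happens here, at interrupt priority. */
void TIMER0_IRQHandler(void)
{
    /* ... clear the timer's interrupt flag (vendor-specific) ... */
    ticks++;
    do_time_critical_io();   /* e.g. sample inputs, update outputs */
    chore_pending = 1;       /* defer non-critical work to main loop */
}

int main(void)
{
    /* ... configure the timer for a 10 ms period (vendor-specific) ... */
    for (;;) {
        if (chore_pending) {
            chore_pending = 0;
            process_results();   /* jitter here doesn't matter */
        }
        /* optionally sleep until the next interrupt, e.g. WFI */
    }
}
```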
On Thu, 3 Jan 2013 22:28:00 +0800, "Bruce Varley" <bv@NoSpam.com>
wrote:

>Yes, it's hard for me to understand too, particularly given that the
>kernel timer.c source code indicates that the base clock rate for the
>kernel is in the very low millisecond range (several build options that
>I can't decipher). But a standard interval timer set for a 10 ms or
>shorter repeat time delivers 20 ms. The supplier apparently doesn't
>want to buy into timing issues; my queries aren't being answered.
>They've probably just given up, can't blame them.
Last century I made some timing accuracy tests on 167 or 333 MHz x86
platforms (using the TSC) and was able to reach less than +/- 1 ms
jitter with 99.xx % reliability on standard Windows NT4. Of course, this
was a headless system (no mouse or keyboard) with carefully selected
applications and hardware (drivers) and with multimedia timers enabled.

On standard Linux 2.4 kernels compiled with HZ=100 (10 ms clock
interrupt rate), the performance was very bad compared to the NT4 case.
Compiling the kernel with HZ=1000 (1 ms) and applying the RT and kernel
pre-empt patches, both platforms performed in a similar way.

As far as I understand, Linux 2.6 essentially contains the RT and
pre-empt patches and, when compiled with HZ=1000 or higher, should give
similar performance.

Running the system clock, and hence timer interrupt services and thus
rescheduling, at say above 1000 Hz should not be a problem, since old
PDP-11 machines with 1 us core memory cycle times (1 MHz) used 50 Hz
(20 ms) clock interrupt rates in RSX-11. Thus your 20 ms latencies sound
very bad, even for a low-resource processor.
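[Editor's note: for readers trying this on a current kernel, here is a
minimal sketch of the standard recipe implied above -- editorial; the
1 ms period and priority 80 are arbitrary choices. SCHED_FIFO scheduling
plus absolute-deadline sleeps is what the RT/pre-empt work makes
dependable.]

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <time.h>

/* Minimal low-jitter periodic loop on a preempt-enabled Linux:
 * real-time FIFO scheduling plus absolute-deadline sleeps, so wakeup
 * error does not accumulate. Period and priority are illustrative. */
#define PERIOD_NS 1000000L   /* 1 ms */

int main(void)
{
    struct sched_param sp = { .sched_priority = 80 };
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        perror("sched_setscheduler (needs root or CAP_SYS_NICE)");

    struct timespec next;
    clock_gettime(CLOCK_MONOTONIC, &next);

    for (;;) {
        next.tv_nsec += PERIOD_NS;
        if (next.tv_nsec >= 1000000000L) {
            next.tv_nsec -= 1000000000L;
            next.tv_sec++;
        }
        /* absolute deadline: loop-body execution time causes no drift */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);

        /* ... periodic work goes here ... */
    }
}
```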
