EmbeddedRelated.com

Do I need a RTOS?

Started by eeboy December 23, 2008
I am bringing up my first ARM project (based on LM3S1439) and am looking
for some advice. In my particular target application the processor will be
communicating with several peripherals via SPI (2) and UART, maintaining a
file system, keeping track of time and date as well as performing a lot of
GPIO manipulation. The board is also designed with a few external
communications ports so that future additions can be accommodated.

I can see lots of wait states in the manipulation of the GPIO. For example
I might have a function that sets a pin, waits for 500ms, then clears a
pin. I could probably get away with software delays at this moment but,
given the fact that the design can scale, I don't want to introduce this
inefficiency now only to have to remove it a few months down the road. That
500ms could be precious later on. I have several timers at my disposal but
I can foresee 'running out' in the future. I can think of some elaborate
solutions to this problem but it makes for messy code. 

In general I was thinking I could implement a system tick which generated
an interrupt at a known interval (say 1ms). Upon each tick, I could examine
counters associated with the periodic tasks to see if they are due for
execution. The random interrupts would be handled by the specific interrupt
handler for that peripheral (example UART receive). That seems
straightforward to me. However, how best to handle (cleanly) the toggling
of a pin after a certain delay? It's not a periodic task.

Should I use a full blown RTOS? If not how should I structure my
application? I've had a look at FreeRTOS but it looks to be much more than
I need and slightly intimidating.

Any suggestions? Please feel free to speak to me like a child as I am not
extremely knowledgeable of operating systems or software architecture (I
am an EE). Also, if you can point me to or provide examples, that would be
extremely helpful! Help me wrap my brain around this problem.

Thanks!


On Tue, 23 Dec 2008 13:10:20 -0600, "eeboy" <jason@jasonorsborn.com>
wrote:

> I am bringing up my first ARM project (based on LM3S1439) and am looking
> for some advice. <snip: original post quoted in full>
I still like John Larkin's comment from
<ioekp314l52vs01sv62ec19s5d82agr84k@4ax.com>: 'If ever you assign a
programmer to do a deep-embedded app, and he says "first, we'll have to
write/buy an rtos", fire him instantly.' The whole thread that contains
that posting would be germane. You can use the message ID above to hit the
advanced search at groups.google.com and bring up the archived thread.

With regards to the pin toggle, it depends somewhat on what exactly 500
msec means: "about half a second" or "500 msec +/- 1 usec." With no other
constraints, I'd do as you are close to suggesting: have the function set
the GPIO pin and also set a tick register to (using your numbers) 500.
When the ticker counts down to zero, reset the GPIO. You could do the
decrement-test-reset in the timer interrupt, or have that ISR touch a
master tick and let your main loop observe that event and handle any
periodic housekeeping.

Throw the characters from the UART's ISR into FIFOs and handle them when
nothing else is going on. A reasonable overall approach is a main loop
with a state machine (switch statement) followed by any "do every time"
event handlers, like a switch debouncer.

--
Rich Webb     Norfolk, VA
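A minimal sketch of that countdown scheme, assuming the 1 ms tick. The
names here are mine, not from the thread or the LM3S1439 headers, and the
`gpio_pin` variable stands in for a write to the real GPIO data register:

```c
#include <stdint.h>

/* 'gpio_pin' stands in for the real GPIO data register bit. */
static volatile uint8_t  gpio_pin;
static volatile uint16_t pin_ticks;   /* armed by foreground, counted by ISR */

/* Foreground: raise the pin and arm the countdown (500 x 1 ms). */
void pulse_pin_500ms(void)
{
    gpio_pin  = 1;
    pin_ticks = 500;
}

/* Called from the 1 ms timer interrupt. */
void timer_tick(void)
{
    if (pin_ticks != 0 && --pin_ticks == 0)
        gpio_pin = 0;                 /* countdown expired: drop the pin */
}
```

Because only the ISR decrements and only the foreground arms, the single
16-bit write is the only thing that needs care on a wider-than-bus MCU; on
a 32-bit ARM it is naturally atomic.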
> I am bringing up my first ARM project (based on LM3S1439) and am looking
> for some advice. In my particular target application the processor will
> be communicating with several peripherals via SPI (2) and UART,
> maintaining a file system, keeping track of time and date as well as
> performing a lot of GPIO manipulation. The board is also designed with a
> few external communications ports so that future additions can be
> accommodated.
It sounds like you have a real mixed bag of requirements, notably the file
system, which can be moderately complex and involves relatively slow
operations (flash write times). <snip>
> In general I was thinking I could implement a system tick which
> generated an interrupt at a known interval (say 1ms). Upon each tick, I
> could examine counters associated with the periodic tasks to see if they
> are due for execution. The random interrupts would be handled by the
> specific interrupt handler for that peripheral (example UART receive).
> That seems straight forward to me. However, how best handle (cleanly)
> the toggling of a pin after a certain delay? It's not a periodic task.
This type of architecture is perfectly adequate in a lot of cases. There are other alternatives too that don't involve an RTOS, for example a run to completion scheduler, co-routines, or a simple round robin scheduler. One of the main things you have to consider though is the maintainability over time as requirements and hardware inevitably change. Take a look at http://www.freertos.org/FAQWhat.html#WhyUseRTOS for other topics that may sway your decision.
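One of those non-RTOS alternatives, the run-to-completion round-robin
loop, is small enough to sketch here. The task names and bodies are
placeholders of my own, only the shape is the point:

```c
#include <stdint.h>

/* Task names and bodies are placeholders for illustration only. */
typedef void (*task_fn)(void);

static uint32_t uart_runs, fs_runs, gpio_runs;

static void uart_task(void) { uart_runs++; /* drain RX FIFO, parse bytes */ }
static void fs_task(void)   { fs_runs++;   /* advance file-system work   */ }
static void gpio_task(void) { gpio_runs++; /* periodic GPIO housekeeping */ }

static const task_fn tasks[] = { uart_task, fs_task, gpio_task };

/* One pass of the superloop: every task runs to completion, in order.
   No preemption, so no locking is needed between the tasks. */
void run_one_pass(void)
{
    for (unsigned i = 0; i < sizeof tasks / sizeof tasks[0]; i++)
        tasks[i]();
}

/* In a real build: int main(void) { for (;;) run_one_pass(); } */
```

The price of this simplicity is that every task must return quickly; one
slow task stalls all the others, which is exactly the maintainability
concern raised above.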
> Should I use a full blown RTOS? If not how should I structure my
> application?
FreeRTOS is not a full blown RTOS in the way that VxWorks or QNX are. It
is designed for microcontrollers and is therefore very small. It's really
a real-time kernel or real-time scheduler.
> I've had a look at FreeRTOS but it looks to be much more than
> I need and slightly intimidating.
:o) I think it is about as simple as it could be. Maybe FreeRTOS is not
intimidating in itself; rather it is the shift in mindset, and the
concepts of a pre-emptive multitasking system, that are - and these would
be constant for any RTOS solution. I am currently writing some new
documentation that is more of a step-by-step guide, rather than a dry
reference. Maybe that will help. Unfortunately it's not ready yet.
> Any suggestions? Please feel free to speak to me like a child
Intimidating children is one of my favourite pastimes - LOL.

> as I am not extremely knowledgeable of operating systems or software
> architecture (I am an EE). Also, if you can point me to or provide
> examples that would be extremely helpful! Help me wrap my brain around
> this problem.
My suggestion would be to download the FreeRTOS Luminary examples and play
around with them for a while. Then you can try sketching together one
solution that uses FreeRTOS and one that doesn't, to see which you are
most comfortable with. Then go for that. Personally I'm sure I would find
the FreeRTOS solution the easiest, but I am already familiar with the
concepts and way of thinking about a multitasking app.

--
Regards, Richard.

+ http://www.FreeRTOS.org
Designed for microcontrollers. More than 7000 downloads per month.

+ http://www.SafeRTOS.com
Certified by TÜV as meeting the requirements for safety related systems.
On 2008-12-23, FreeRTOS.org <noemail@given.com> wrote:

> This type of architecture is perfectly adequate in a lot of cases. There
> are other alternatives too that don't involve an RTOS, for example a run
> to completion scheduler, co-routines, or a simple round robin scheduler.
I recently used "protothreads" for a project and thought it was pretty
cool. You can structure your code as if you're using a multitasking
kernel, but in fact you're using state machines and co-operative tasking:

http://www.sics.se/~adam/pt/

In my case there was some hard-realtime stuff going on that required
exclusive and intensive use of interrupts, so the "tasks" could neither
use nor disable interrupts ever.

As FreeRTOS.org said, "using an RTOS" isn't really a Yes/No question.
There's a whole spectrum ranging from coroutines, up through a 2KB
bare-bones scheduler, all the way up to something like QNX or another
"real-time" Unix-like system requiring several MB of RAM just to get
started.

--
Grant Edwards   grante at visi.com   Yow! I wish I was a sex-starved
                                     manicurist found dead in the Bronx!!
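The protothread trick is a switch statement whose case labels are hidden
inside the wait macros, so a function can "block" and later resume at the
same spot. This is a simplified sketch in the spirit of Dunkels' pt.h,
not the real API; `now_ms` and `led` are stand-ins of my own:

```c
#include <stdint.h>

typedef struct { uint16_t lc; } pt_t;   /* "local continuation" */

/* Duff's-device resumption: __LINE__ makes each wait point a unique case. */
#define PT_BEGIN(pt)         switch ((pt)->lc) { case 0:
#define PT_WAIT_UNTIL(pt, c) do { (pt)->lc = __LINE__; case __LINE__: \
                                  if (!(c)) return 0; } while (0)
#define PT_END(pt)           } (pt)->lc = 0; return 1;

static volatile uint32_t now_ms;   /* advanced by a timer tick elsewhere */
static uint8_t led;

/* Reads like sequential code, but compiles to a resumable state machine. */
static int blink_thread(pt_t *pt)
{
    static uint32_t t0;
    PT_BEGIN(pt);
    for (;;) {
        led = 1; t0 = now_ms;
        PT_WAIT_UNTIL(pt, now_ms - t0 >= 500);   /* "sleep" without blocking */
        led = 0; t0 = now_ms;
        PT_WAIT_UNTIL(pt, now_ms - t0 >= 500);
    }
    PT_END(pt);
}
```

Each call either makes progress or returns immediately, so a plain main
loop can poll many such threads; the cost is that locals must be `static`
(or live in the pt struct) to survive across waits.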
"Grant Edwards" <grante@visi.com> wrote in message 
news:Z4udnS8boYwyz8zUnZ2dnUVZ_gWdnZ2d@posted.visi...
<snip>

> I recently used "protothreads" for a project and thought it was
> pretty cool. You can structure your code as if you're using a
> multitasking kernel, but in fact you're using state machines
> and co-operative tasking:
Protothreads are a really nice system when RAM is at a real premium. The
co-routines provided in FreeRTOS are very similar. My one gripe with them
is that the heavy use of macros makes debugging a bit of a swine.
> In my case there was some hard-realtime stuff going on that
> required exclusive and intensive use of interrupts, so the
> "tasks" could neither use nor disable interrupts ever.
The newer FreeRTOS ports have a full nesting scheme where interrupts above
a user-definable priority level are unaffected by critical sections - the
downside of which is that they also cannot use FreeRTOS API functions -
but this allows incredibly low jitter in high frequency interrupts. Ideal
for apps such as motor control. On a Cortex M3 running at 50MHz I have
measured the jitter to be 120ns, which is exactly the jitter that would be
introduced by the Cortex interrupt tail chaining, so the RTOS adds zero.
The Cortex M3 interrupt model is perfect for this though; other
architectures don't do so well.

--
Regards, Richard.

+ http://www.FreeRTOS.org
Designed for microcontrollers. More than 7000 downloads per month.

+ http://www.SafeRTOS.com
Certified by TÜV as meeting the requirements for safety related systems.
On 2008-12-23, FreeRTOS.org <noemail@given.com> wrote:
<snip>

> Protothreads are a really nice system when RAM is at a real premium. The
> co-routines provided in FreeRTOS are very similar. My one gripe with
> them is that the heavy use of macros makes debugging a bit of a swine.
True. If you end up with a typo in one of the protothread macro parameters (e.g. PT_WAIT_UNTIL()), it can be a real brain-teaser.
>> In my case there was some hard-realtime stuff going on that
>> required exclusive and intensive use of interrupts, so the
>> "tasks" could neither use nor disable interrupts ever.
>
> The newer FreeRTOS ports have a full nesting scheme where
> interrupts above a user definable priority level are
> unaffected by critical sections
I presume it requires user-configurable hardware interrupt priorities that
also control interrupt nesting (not just the more common "priority"
control that determines which pending interrupt gets serviced)?
> - the downside of which is that they also cannot use FreeRTOS API
> functions - but this allows incredibly low jitter in high frequency
> interrupts. Ideal for apps such as motor control. On a Cortex M3 running
> at 50MHz I have measured the jitter to be 120ns, which is exactly the
> jitter that would be introduced by the Cortex interrupt tail chaining,
> so the RTOS adds zero. The Cortex M3 interrupt model is perfect for this
> though, other architectures don't do so well.
--
Grant Edwards   grante at visi.com   Yow! JAPAN is a WONDERFUL planet -- I
                                     wonder if we'll ever reach their
                                     level of COMPARATIVE SHOPPING ...
Thank you all for your input. That's exactly what I needed. I've got a lot
to look over and take in. Protothreads looks like it may be a good option.

I stated my generalized thoughts on a suitable scheme but was having
difficulties with the problem where I needed to engage a pin, do something
else for 500ms (exactly) then return. I got to thinking that I could
implement some sort of queue/array that I could add to as I go along. In
that queue I could add a pointer to the function that I wish to execute and
a time to execute:

Task    Function             Time till execution
----    ------------------   -------------------
1       PowerMotorOff        124 ticks
2       StopWaitingForData   1000 ticks
3       ...

So, on each system tick I would go through and evaluate the array for any
pending tasks which needed to be executed (time decremented to zero). I
guess it's kind of like a callback function. There would be no priority
given to these tasks, but I don't think there really needs to be any for
my situation. The majority of these tasks are very simple and require very
little time to execute (like setting a bit in the I/O port register). The
issues that I can see are that I can't pass arguments and the code might
not be as modular as I'd like.

I may be describing exactly what an OS does... I don't know. However, your
feedback would help. Please shoot holes in my thought process.

Thanks!


I'm mostly responding to your point about "not a periodic task" when
discussing GPIO delays.  None of what I'm saying should either
encourage you or discourage you from weighing all your other concerns.
But I thought I might point out one way to solve one problem to see if
that helps.

You pointed out the basic idea of having a regular timer (you
suggested a periodic one at 1ms.)  That's reasonable and you can use
it for a lot of things (adjust the rate per your minimum delay need,
of course.)  For some things you may have to deal with, you could set
up state machines attached to the timer which would get a small bit of
cpu for a moment each tick (or every so many ticks.)  An example of
something somewhat complex that works okay this way would be reading
and writing a serial EEPROM "in the background."  Normally, the state
machine would just stay in a quiescent state of "do nothing" until
some data needs to be transferred.  Then the state machine would
transition out of the quiescent state and begin the transfer, moving
from state to state at the timer rate.  (Assuming you didn't have a
hardware peripheral for all this, of course, and had to bit-bang the
transfer.)

However, you also said that the "the toggling of a pin after a certain
delay ... [is] not a periodic task."  Well, counting down the counter
can be thought of as a periodic task.  The problem then remains that
you might have a lot of counters to count down.  A solution, so that
you only have one to worry about _ever_, is to use a delta queue for
such timing and to re-insert the desired routine given the delay to
the next event.

For example, suppose you wanted to toggle a particular pin so that it
was high for 50ms and low for 150ms.  (Assume your 1ms timer event
exists and that delta queue insertion is already implemented [to be
discussed in a moment.])  The code might look like:

  void SetPinLow( void );   /* forward declaration */

  void SetPinHigh( void ) {
      /* some code needed to set the pin high */
      insertdq( SetPinLow, 50 );
  }

  void SetPinLow( void ) {
      /* some code needed to set the pin low */
      insertdq( SetPinHigh, 150 );
  }

In the above case, you are responsible for queueing up the next
operation.  But that is pretty easy, really.  The number there just
specifies the number of timer ticks to wait out "from now."  If the
delta queue handler does a reasonably fast job calling code when it is
supposed to and the operation isn't itself slow [if it is, just re-
insert the process at the start of the procedure instead of the end of
it], and/or if the timer is slow by comparison, then the triggered
events will appear to be consistently accurate.

Okay.  So what is a delta queue?  Assume the queue is empty.  When you
try and insert a new entry (function pointer and time), it is inserted
immediately into the queue with the delay time as given.  The timer
then decrements that counter and, when it reaches zero, it removes the
entry from the queue and executes the function.  For the case where a
second entry is to be added before the first is executed, an example
might help:

    insertdq( P1, 150 );
    insertdq( P2, 250 );
    insertdq( P3, 100 );
    insertdq( P4, 200 );

Assume these happen back-to-back in short order, between 1ms timer
ticks, and assume the queue is empty to start.  The queue will look
like the following:

After the 1st insert:
 --> [ P1,   150 ]    ; P1 wants to wait out 150 timer events

After the 2nd insert:
 --> [ P1,   150 ]    ; P1 still wants to wait out 150 timer events
     [ P2,   100 ]    ; P2 will then wait another 100 timer events

After the 3rd insert:
 --> [ P3,   100 ]    ; P3 is to wait out 100 timer events
     [ P1,    50 ]    ; P1 will now have to wait out only 50 more
     [ P2,   100 ]    ; P2 waits yet another 100 timer events

After the 4th insert:
 --> [ P3,   100 ]    ; P3 is to wait out 100 timer events
     [ P1,    50 ]    ; P1 will wait out an additional 50 (150 total)
     [ P4,    50 ]    ; P4 will wait yet another 50 (200 total)
     [ P2,    50 ]    ; P2 will wait still another 50 (250 total)

What happens in the insert code is that the time is compared to the
current entry in the queue (starts at the first.)  If the time of the
process to be inserted is less or equal to it, then the process and
its time is inserted at that point (before the current entry) and the
time of the newly inserted process is subtracted from what was the
'current' entry, which had the greater or equal value.  If the time of
the process to be inserted is greater, though, then the time value of
the current queue entry is subtracted from the inserting entry's time,
the current entry is advanced to the next entry, and the examination
continues as already described.  Obviously, if you reach the end of
the queue, you just add the entry and its time.

If you use this method, your timer code will never have to decrement
more than one timer -- the one belonging to the queue entry at the top
of the queue.  All of the other entries are automatically decremented
by doing that, as their times listed in the queue are simply how much
"more time" is needed by them than the process earlier than they are.

If the timer event handler encounters a situation where the top entry
on the queue is decremented to zero AND where one or more entries
after it also have their times listed as zero, then the timer event
handler should call the top entry's code and when that returns then
call the next entry, etc., until all 'zero' entries have been both run
and removed from the queue.  It stops when it finds a queue entry with
a non-zero counter value.  At that point, the count down can continue
at the next timer event.
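The insert walk and the tick handler described above can be sketched in C
like this. It is statically allocated; every name except insertdq is my
own invention, and running the callback straight from the tick handler
assumes, as the thread does, that the callbacks stay trivial:

```c
#include <stddef.h>

#define DQ_SLOTS 16

typedef void (*dq_fn)(void);

typedef struct dq_node {
    dq_fn           fn;
    unsigned        ticks;   /* delay relative to the entry before it */
    struct dq_node *next;
} dq_node;

static dq_node  pool[DQ_SLOTS];
static dq_node *free_list;
static dq_node *head;        /* soonest entry first */

void dq_init(void)
{
    head = NULL;
    free_list = &pool[0];
    for (int i = 0; i < DQ_SLOTS - 1; i++)
        pool[i].next = &pool[i + 1];
    pool[DQ_SLOTS - 1].next = NULL;
}

/* Insert 'fn' to run 'ticks' timer events from now. */
int insertdq(dq_fn fn, unsigned ticks)
{
    if (free_list == NULL)
        return -1;                        /* no slots left */
    dq_node *n = free_list;
    free_list  = n->next;
    n->fn      = fn;

    /* Walk the queue, shedding each earlier entry's delta from 'ticks'. */
    dq_node **link = &head;
    while (*link != NULL && ticks > (*link)->ticks) {
        ticks -= (*link)->ticks;
        link   = &(*link)->next;
    }
    if (*link != NULL)
        (*link)->ticks -= ticks;          /* the entry after us needs less */
    n->ticks = ticks;
    n->next  = *link;
    *link    = n;
    return 0;
}

/* Call once per timer event: only the head entry is ever decremented. */
void dq_tick(void)
{
    if (head == NULL)
        return;
    if (head->ticks > 0)
        head->ticks--;
    while (head != NULL && head->ticks == 0) {
        dq_node *n = head;
        dq_fn    f = n->fn;
        head       = n->next;
        n->next    = free_list;           /* recycle the slot first... */
        free_list  = n;
        f();                              /* ...so f() may re-insert */
    }
}
```

Running the P1..P4 example through insertdq reproduces the deltas in the
tables above (100, 50, 50, 50).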

I think you can see the value, and perhaps some of the cautions in the
above method.  But it is simple enough.  If you also use something
more complicated than just toggling a port pin, you can set up quite a
few 'state machines', each hooked to the timer event with an
appropriate delay queued up for the next time they need a little cpu.
The main code for a state machine might have a static 'state' variable
used to index through an array of state machine functions, calling
each one in turn and allowing each state to return the value of the
next state to use.  This main code would then re-schedule another
execution by inserting itself after a fixed delay, if you want.
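The state-machine shape just described, with each handler returning the
next state, might look like this. All the state names and handler bodies
here are hypothetical placeholders:

```c
/* Each handler does a little work and returns the next state to run. */
typedef enum { ST_IDLE, ST_START, ST_XFER, ST_DONE, ST_COUNT } state_t;

typedef state_t (*state_fn)(void);

static state_t do_idle(void)  { return ST_IDLE;  }  /* wait for work   */
static state_t do_start(void) { return ST_XFER;  }  /* begin transfer  */
static state_t do_xfer(void)  { return ST_DONE;  }  /* move one byte   */
static state_t do_done(void)  { return ST_IDLE;  }  /* clean up        */

/* Indexed by the current state; order must match the enum. */
static const state_fn handlers[ST_COUNT] = {
    do_idle, do_start, do_xfer, do_done
};

static state_t state = ST_IDLE;

/* Called from the tick, or re-inserted into the delta queue each time. */
void machine_step(void)
{
    state = handlers[state]();
}
```

Each machine_step call is one small slice of CPU, so a bit-banged EEPROM
transfer of the kind mentioned earlier becomes a chain of short states
rather than one long blocking routine.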

You could also consider looking up concepts such as thunks, iterators
of the CLU variety, and cooperative coroutines.  It's not hard to
write up a little bit of code to support a thread switch.

One of the nice advantages of doing the little bit of coding you need,
yourself, is that you know what you have and understand it well.
Hauling in an operating system written for general purpose use can be
a fair learning curve, which may easily compare to the work involved
in writing your own delta queue code or thread switch function.  Worth
considering.

Jon
That was an excellent example! It makes a lot of sense to me and I can't
see any immediate flaws with my intended application in mind. It also
seems 'simple' enough that I can write it without investing a great deal
of time. I think I'll use that as my starting point. I am very much in
agreement with your comment on rolling your own. It's always been much
more difficult for me to inherit code or use others' code than it is to
make my own (within reason, of course). I'll get the benefit of learning
how it works and what its positive and negative attributes are.

Just a few questions about implementation of such a scheme...

1) Are there any stats besides the number of items presently in the queue
that would be helpful to maintain?

2) I would envision using malloc and memcpy to implement the queue. Are
these appropriate (efficient enough) in the embedded world? I would assume
so but...

Also, any recommended books on this subject matter (keeping in mind that I
have zero experience in this realm)?

Thanks!
On Tue, 23 Dec 2008 22:31:13 -0600, "eeboy" <jason@jasonorsborn.com>
wrote:

> That was an excellent example! It makes a lot of sense to me and I can't
> see any immediate flaws with my intended application in mind. <snip>
Daring to assume you are responding to me, see replies below:
> Just a few questions about implementation of such a scheme...
>
> 1) Are there any stats besides the number of items presently in the
> queue that would be helpful to maintain?
A singly-linked list of function addresses and delay times is probably fine, given that I know nothing much more about your application.
> 2) I would envision using malloc and memcpy to implement the queue. Are
> these appropriate (efficient enough) in the embedded world? I would
> assume so but...
My own penchant would be to entirely avoid malloc and its ilk. Instead, I
would make an educated guess about how many slots you will need and
allocate them as an array. More efficient, as compilers can readily
convert some of the references to elements into direct memory references.
And you get compile-time errors instead of run-time errors you would have
no idea what to do with if you got them at that time.

I'd recommend that your array/list be arranged into three semantic
regions -- the head of one "eating" the tail of the next, in a sense.
Every slot in the list should be in one of three conceptual queues
(although they are all in the same physical array): (1) the "avail" list
of slots that can be used to add a new function to be called later; (2)
the "sleep" list of functions that are still waiting for their time to
come; and (3) the "ready" list of functions that have timed out but have
not yet been executed.

The three queue-head pointers are used to reference the first entry in
their particular queue. All linked entries in the chain, following the one
indicated by the queue-head, belong to that queue until the first entry of
the adjacent queue is seen. If two adjacent queue heads reference the same
entry then the "earlier" queue-head owns an empty queue. I hope that's
plain enough. But just to be certain, examine the following diagram for
clarification (the chain is circular: the tail of the last queue links
back around to the head of the first):

    Avail            Ready            Sleep
      |                |                |
      v                v                v
     [] -> [] ------> [] -> [] ------> [] -> [] -> [] --+
      ^                                                 |
      +-------------------------------------------------+

The queue-head pointers can be advanced in only one direction. For
example, it is illegal to move an entry from the sleep-queue to the
avail-queue by backing up the avail-queue's "head." Instead, the entry
must be moved from the sleep-queue to the ready-queue first, by advancing
the sleep-queue "head." The avail-queue "head" is advanced when a new
entry is inserted into the sleep queue.
The ready-queue "head" is advanced when a function that is ready to
execute has been, in fact, executed. The sleep-queue "head" is advanced
when an entry has timed out and is now ready to be executed.

A queue is empty when its "head" runs up against the following queue's
"head" (as I already mentioned.) So, the avail-queue is empty when the
avail-queue's "head" is equal to the ready-queue's "head". The ready-queue
"follows" the avail-queue. (Kind of like the scissors-paper-rock game.)

If you follow this, so far, it's appropriate to examine the possible
"states" of the three queues to point out an important design issue. Each
queue may be either empty or not empty. With three queues, this leads to
eight states. The following table shows the different combinations and
compares the associated "head" pointers for the queues with each other.
I've used a (+) to indicate a non-empty queue and a (-) to indicate an
empty one.

  Sleep  Ready  Avail  Pointer conditions  Description of condition
  ------------------------------------------------------------------
    -      -      -    A==R, R==S, S==A    Meaningless
    +      -      -    A==R, R==S, S==A    All slots are sleeping
    -      +      -    A==R, R==S, S==A    All slots are ready to run
    -      -      +    A==R, R==S, S==A    All slots are available
    -      +      +    A<>R, R<>S, S==A    No slots are sleeping
    +      -      +    A<>R, R==S, S<>A    No slots are ready to run
    +      +      -    A==R, R<>S, S<>A    No slots available
    +      +      +    A<>R, R<>S, S<>A    No empty queues

As you can see, four of the states are indistinguishable from each other
if the only thing you do is compare pointers. This problem can easily be
resolved by insisting that the avail-queue never be allowed to become
empty (a simpler choice than keeping either of the other two queues
non-empty.) With that assertion made, we can conclude that when the three
queue head-pointers are equal to each other, the avail-queue is full and
the other two queues are empty.
I'd use at least three subroutines to manage the three queues: an insert
function which moves entries from the avail-queue to the sleep-queue by
advancing the head-pointer of the avail-queue; an execute function which
moves entries from the ready-queue to the avail-queue by advancing the
head-pointer of the ready-queue; and the "timer" function, which moves
entries from the sleep-queue to the ready-queue by advancing the
head-pointer of the sleep-queue. These three functions have exclusive
access to the head-pointer of their respective queues, and this design
decision is crucial to removing the need for interrupt masking in order to
protect the data structures from concurrency/re-entrancy problems.
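A minimal sketch of that three-region arrangement, using array indices as
the queue heads. All the names here are mine, not Jon's, and it makes one
simplification: each sleeping slot keeps its own countdown and is only
released when it reaches the sleep head, so out-of-order expiries run a
little late (a delta queue, as above, would fix that). The point it does
show is the pointer discipline: each head is written by exactly one
function, so the tick handler and the foreground never need to mask
interrupts around the shared array:

```c
#define NSLOTS 8

typedef void (*slot_fn)(void);
typedef struct { slot_fn fn; unsigned ticks; } slot_t;

static slot_t   slots[NSLOTS];
static unsigned a_head, s_head, r_head;   /* all equal => avail is full */

static unsigned nxt(unsigned i) { return (i + 1u) % NSLOTS; }

/* Foreground only: move one slot from avail to sleep (advances a_head).
   Refuses the last slot so the avail queue never becomes empty. */
int q_insert(slot_fn fn, unsigned ticks)
{
    if (nxt(a_head) == r_head)
        return -1;
    slots[a_head].fn    = fn;
    slots[a_head].ticks = ticks;
    a_head = nxt(a_head);
    return 0;
}

/* Timer tick only: age the sleepers, release expired ones (advances s_head). */
void q_tick(void)
{
    for (unsigned i = s_head; i != a_head; i = nxt(i))
        if (slots[i].ticks > 0)
            slots[i].ticks--;
    while (s_head != a_head && slots[s_head].ticks == 0)
        s_head = nxt(s_head);
}

/* Foreground only: run whatever has become ready (advances r_head). */
void q_run_ready(void)
{
    while (r_head != s_head) {
        slots[r_head].fn();
        r_head = nxt(r_head);
    }
}

/* Hypothetical demo payload for the usage below. */
static unsigned pin_state = 1;
static void pin_low(void) { pin_state = 0; }
```

Typical use: the main loop calls q_run_ready() every pass, the 1 ms ISR
calls q_tick(), and anything needing a delayed action calls q_insert().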
>Also, any recommended books on this subject matter (keeping in mind that I >have zero experience in this realm)?
Hmm. I first read the term "delta queue" in a book by Douglas Comer in
1984 called "Operating System Design - The XINU Approach." Look on this
web page, 2nd item down, for it:

http://www.cs.purdue.edu/homes/dec/osbooks.html

There are subsequent books on XINU, but none so clear and easy to read as
the first one.

Jon
