EmbeddedRelated.com

Task priorities in non strictly real-time systems

Started by pozz January 3, 2020
On 03/01/2020 15:19, David Brown wrote:
> With pre-emptive scheduling, you will have to go
> through your existing code and make very sure that you have locks or
> synchronisation in place for any shared resources or data.
As I have already written many times, I don't have experience with RTOSes or with task synchronisation mechanisms such as semaphores, locks, mutexes, message queues and so on, so I can't tell when synchronisation is really needed.

Could you point me to some good, simple material to study (online or a book)?

For example, I often have a serial channel on which some data is received. A frame parser decodes the "wire data" into variables that are accessed by other tasks.

    while (1) {
        serial_task();   // frame receiver/parser
        main_task();     // uses variables touched by frame parser
    }

Supposing all the variables are of type int (i.e., they are changed atomically in serial_task()), do I need to protect them with locks because they are used by main_task() too? I think a lock isn't needed, unless main_task() needs coherent values for all the variables (all variables with new values, or all with old values).
On 5/1/20 11:58 am, pozz wrote:
> On 03/01/2020 15:19, David Brown wrote:
>> With pre-emptive scheduling, you will have to go
>> through your existing code and make very sure that you have locks or
>> synchronisation in place for any shared resources or data.
>
> As I already wrote many times, I don't have experience with RTOS and
> task sync mechanism such as semaphores, locks, mutexes, message queues
> and so on.
> So I'm not able to understand when a sync is really needed.
> Could you point on a good simple material to study (online or book)?
I wish I could, but it is actually a frightfully difficult subject. Basically it's the same as thread-safe programming. Only about 1% of programmers think they can do it. Of those, only about 1% actually can.

It's the 0.99% that you have to worry about. At least some of them work for Toyota. Don't be one of them!

However, this difficulty is precisely why Rust was created. Although I haven't yet done a project in Rust, I've done enough multi-threaded work in C++ to know that the ideas in Rust are a massive leap forwards, and anyone doing this kind of work (especially professionally) owes it to their users to learn it.
> Supposing all the variables are of type int (i.e., they are changed
> atomically in serial_task()), should I need to protect them with locks,
> because they are used by main_task() too?
If "int" is your CPU's word size, you are using word alignment, and you don't have multiple CPUs with separate caches accessing the same RAM, you're probably OK for individual variables. However, you will come unstuck if you expect assignments and reads to be performed in the same order you wrote them. A modern compiler will freely re-order things in extremely ambitious and unexpected ways, in order to keep the pipeline flowing.

I cannot emphasise this enough. The compiler will do what it can to make your program do what it thinks you have asked for - which will NOT be the same as what you think you have asked for.

You need to understand basic mutex operations, preferably also semaphores, and beyond that read and write barriers (if you want to write lock-free code). It's a big subject.

Clifford Heath.
pozz <pozzugno@gmail.com> writes:
> As I already wrote many times, I don't have experience with RTOS and
> task sync mechanism such as semaphores, locks, mutexes, message queues
> and so on.
> So I'm not able to understand when a sync is really needed.
>
> Could you point on a good simple material to study (online or book)?
I have found it simplest to have tasks communicate by message passing, the so-called "CSP model" (communicating sequential processes), rather than fooling around with explicit locks. With locks you have to worry about lock inversion and all kinds of other madness, and your main hope of getting it right is formal methods, like Lamport used for the Paxos algorithm. Message passing incurs some cpu overhead because of the interprocess communication and context switches, but it gets rid of a lot of ways things go wrong.

If your RTOS supports message passing (look for "mailboxes" in the RTOS docs) then I'd say use them.

The language most associated with CSP style is Erlang, which doesn't really fit on small embedded devices, but Erlang materials might still be a good place to learn about the style. Erlang inventor Joe Armstrong's book might be a good place to start:

    http://erlang.org/download/erlang-book-part1.pdf

At the much lower end, you could check out Brad Rodriguez's articles about Forth multitaskers:

    https://www.bradrodriguez.com/papers/mtasking.html

and related ones at https://www.bradrodriguez.com/papers/ .
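To make the message-passing idea concrete, here is a minimal sketch of a mailbox for exactly one sending task and one receiving task, written with C11 atomics. It is not the API of any particular RTOS: `message_t`, `mailbox_t`, `mailbox_send` and `mailbox_receive` are names invented for this illustration, and a real RTOS mailbox would normally also let the receiver block until a message arrives.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAILBOX_SLOTS 8u   /* assumption: must be a power of two */

/* Hypothetical message layout, for illustration only. */
typedef struct {
    int id;
    int value;
} message_t;

/* A zero-initialised mailbox_t (e.g. one with static storage) is empty. */
typedef struct {
    message_t slots[MAILBOX_SLOTS];
    atomic_size_t head;   /* next slot to write; advanced by the sender only   */
    atomic_size_t tail;   /* next slot to read;  advanced by the receiver only */
} mailbox_t;

/* Sender side: copies *msg into the mailbox. Returns false if it is full. */
bool mailbox_send(mailbox_t *mb, const message_t *msg)
{
    size_t head = atomic_load_explicit(&mb->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&mb->tail, memory_order_acquire);
    if (head - tail == MAILBOX_SLOTS) {
        return false;
    }
    mb->slots[head % MAILBOX_SLOTS] = *msg;
    /* Release: the slot contents become visible before the new head does. */
    atomic_store_explicit(&mb->head, head + 1u, memory_order_release);
    return true;
}

/* Receiver side: copies the oldest message into *out. Returns false if empty. */
bool mailbox_receive(mailbox_t *mb, message_t *out)
{
    size_t tail = atomic_load_explicit(&mb->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&mb->head, memory_order_acquire);
    if (head == tail) {
        return false;
    }
    *out = mb->slots[tail % MAILBOX_SLOTS];
    /* Release: the slot is handed back to the sender only after the copy. */
    atomic_store_explicit(&mb->tail, tail + 1u, memory_order_release);
    return true;
}
```

The receiving task never touches data the sender is still writing, so no lock is needed. The price is that each message is copied twice, and the sketch is only valid for a single sender and a single receiver.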
On Sun, 5 Jan 2020 14:26:12 +1100, Clifford Heath <no.spam@please.net> wrote:

>On 5/1/20 11:58 am, pozz wrote:
>> On 03/01/2020 15:19, David Brown wrote:
>>> With pre-emptive scheduling, you will have to go
>>> through your existing code and make very sure that you have locks or
>>> synchronisation in place for any shared resources or data.
>>
>> As I already wrote many times, I don't have experience with RTOS and
>> task sync mechanism such as semaphores, locks, mutexes, message queues
>> and so on.
>> So I'm not able to understand when a sync is really needed.
>> Could you point on a good simple material to study (online or book)?
>
>I wish I could, but it is actually a frightfully difficult subject.
>Basically it's the same as thread-safe programming.
>Only about 1% of programmers think they can do it.
>Of those, only about 1% actually can.
>
>It's the 0.99% that you have to worry about. At least some of them for
>Toyota. Don't be one of them!
>
>However, this difficulty is precisely why Rust was created. Although I
>haven't yet done a project in Rust, I've done enough multi-threaded work
>in C++ to know that the ideas in Rust are a massive leap forwards, and
>anyone doing this kind of work (especially professionally) owes it to
>their users to learn it.
>
>> Supposing all the variables are of type int (i.e., they are changed
>> atomically in serial_task()), should I need to protect them with locks,
>> because they are used by main_task() too?
You may need some double buffering in one form or another.

Assume you have a receiver byte buffer that can take a full serial message and a structure of integers that will receive the values decoded from the message. When the serial task notices the end of a message, it immediately decodes the values into the integers in the struct. After this, the serial byte buffer is ready to start receiving the next message. The serial task can then inform the main task that new data is in the integer structure, and the main task can copy it to local variables.

Alternatively, if a serial byte buffer is not used but the received bytes are decoded into the integer fields on the fly, then a copy of the struct may be provided, e.g. after the last integer has been decoded, put the complete struct into a mailbox, if the RTOS provides mailbox support.

In both cases the main task has a full message transfer time to process a message before it has to process the next serial message. If the main task is incapable of processing the messages in time, then the program is faulty, at least in the hard real-time sense.
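The first scheme (decode into a struct, then tell the main task) might look like the sketch below. It assumes one serial task and one main task, and uses a C11 atomic flag for the handover. The struct fields and the names `decoded_values_t`, `serial_publish` and `main_fetch` are invented for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical set of values decoded from one serial frame. */
typedef struct {
    int speed;
    int temperature;
    int status;
} decoded_values_t;

static decoded_values_t shared_values;  /* written by the serial task only     */
static atomic_bool new_data_ready;      /* handover flag; static, so false at start */

/* Serial task: call once a complete frame has been decoded.
 * Returns false, leaving the shared struct untouched, if the main task
 * has not yet consumed the previous frame. */
bool serial_publish(const decoded_values_t *decoded)
{
    if (atomic_load_explicit(&new_data_ready, memory_order_acquire)) {
        return false;
    }
    shared_values = *decoded;
    /* Release: all fields are written before the flag becomes visible. */
    atomic_store_explicit(&new_data_ready, true, memory_order_release);
    return true;
}

/* Main task: copies a complete, coherent set of values into *out.
 * Returns false if nothing new has arrived. */
bool main_fetch(decoded_values_t *out)
{
    if (!atomic_load_explicit(&new_data_ready, memory_order_acquire)) {
        return false;
    }
    *out = shared_values;
    /* Release: the copy is finished before the serial task may overwrite. */
    atomic_store_explicit(&new_data_ready, false, memory_order_release);
    return true;
}
```

Because the main task only ever works on its local copy, it sees either all old values or all new values, never a mixture. A `false` return from `serial_publish` is the "main task is too slow" condition described above.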
>If "int" is your CPUs word size, you are using word alignment, and you
>don't have multiple CPUs with separate caches accessing the same RAM,
>you're probably ok for individual variables. However you will come
>unstuck if you expect assignments and reads to be performed in the same
>order you wrote them. A modern compiler will freely re-order things in
>extremely ambitious and unexpected ways, in order to keep the pipeline
>flowing.
Using a volatile declaration and turning off optimization will help. Better yet, use small assembler routines to have full control of the actual memory access.
>
>I cannot emphasise this enough. The compiler will do what it can to make
>your program do what it thinks you have asked for - which will NOT be
>the same as what you think you have asked for.
>
>You need to understand about basic mutex operations, preferably also
>semaphores, and beyond that to read and write barriers (if you want to
>write lock-free code). It's a big subject.
At least with a small microcontroller, simply disable interrupts for a critical section. Of course, the critical section must behave like a real interrupt handler: limit the number of instructions and do not call any library routines.
On 05/01/2020 04:26, Clifford Heath wrote:
> On 5/1/20 11:58 am, pozz wrote:
>> On 03/01/2020 15:19, David Brown wrote:
>>> With pre-emptive scheduling, you will have to go
>>> through your existing code and make very sure that you have locks or
>>> synchronisation in place for any shared resources or data.
>>
>> As I already wrote many times, I don't have experience with RTOS and
>> task sync mechanism such as semaphores, locks, mutexes, message queues
>> and so on.
>> So I'm not able to understand when a sync is really needed.
>> Could you point on a good simple material to study (online or book)?
>
> I wish I could, but it is actually a frightfully difficult subject.
> Basically it's the same as thread-safe programming.
> Only about 1% of programmers think they can do it.
> Of those, only about 1% actually can.
>
> It's the 0.99% that you have to worry about. At least some of them for
> Toyota. Don't be one of them!
All good points.
>
> However, this difficulty is precisely why Rust was created. Although I
> haven't yet done a project in Rust, I've done enough multi-threaded work
> in C++ to know that the ideas in Rust are a massive leap forwards, and
> anyone doing this kind of work (especially professionally) owes it to
> their users to learn it.
"Safe" languages like Rust can help with simple issues, but won't give any benefit in the more challenging cases. If you understand the basics of multi-threading, and have a good, careful development methodology, you won't have the kind of problems that Rust would help you with. Maybe Rust will help in some cases, but don't believe that it is a game-changer.
>
>> Supposing all the variables are of type int (i.e., they are changed
>> atomically in serial_task()), should I need to protect them with
>> locks, because they are used by main_task() too?
>
> If "int" is your CPUs word size, you are using word alignment, and you
> don't have multiple CPUs with separate caches accessing the same RAM,
> you're probably ok for individual variables. However you will come
> unstuck if you expect assignments and reads to be performed in the same
> order you wrote them. A modern compiler will freely re-order things in
> extremely ambitious and unexpected ways, in order to keep the pipeline
> flowing.
Yes. Simple reads and writes of aligned data that is no bigger than the cpu's word size will be atomic without any more effort. But complex accesses (like "x++;") are not atomic on most processors. And you don't have any ordering unless you use "volatile", or memory fences of some kind. A key mistake many people make is to think that non-volatile accesses are also ordered by volatile accesses - this is, of course, untrue.
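To see why "x++" is not atomic on most processors, it helps to write out the separate load, add and store steps by hand. The sketch below is a single-threaded simulation, not real concurrent code: the "interrupt" is simply an increment placed in the worst possible spot, and the name `lost_update_demo` is invented for illustration.

```c
static volatile int counter;

/* Simulates a task doing "counter++" as load / add / store while an
 * interrupt performs its own complete increment in between.
 * Two increments are performed, but the function returns 1. */
int lost_update_demo(void)
{
    counter = 0;

    int tmp = counter;       /* task: load the old value (0)               */

    counter = counter + 1;   /* "interrupt": a complete increment, 0 -> 1  */

    tmp = tmp + 1;           /* task: add, still using the stale value     */
    counter = tmp;           /* task: store 1, overwriting the interrupt's */

    return counter;
}
```

Note that `volatile` does nothing to prevent this. It forces each access to happen, but it does not make the three steps indivisible.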
>
> I cannot emphasise this enough. The compiler will do what it can to make
> your program do what it thinks you have asked for - which will NOT be
> the same as what you think you have asked for.
Well, it /will/ be the same as you think you asked for when you know what you are doing!
>
> You need to understand about basic mutex operations, preferably also
> semaphores, and beyond that to read and write barriers (if you want to
> write lock-free code). It's a big subject.
>
Yes. Or he can use cooperative multitasking, and avoid many of these issues!
> Clifford Heath.
On 05/01/2020 10:21, upsidedown@downunder.com wrote:
> On Sun, 5 Jan 2020 14:26:12 +1100, Clifford Heath <no.spam@please.net>
>>
>> If "int" is your CPUs word size, you are using word alignment, and you
>> don't have multiple CPUs with separate caches accessing the same RAM,
>> you're probably ok for individual variables. However you will come
>> unstuck if you expect assignments and reads to be performed in the same
>> order you wrote them. A modern compiler will freely re-order things in
>> extremely ambitious and unexpected ways, in order to keep the pipeline
>> flowing.
>
> Using volatile declaration and turn of optimization will help. Better
> yet, use small assembler routines to have full control of actual
> memory access.
Turning off optimisation is /never/ the answer! (Barring buggy compilers, of course.)

If your code "works with optimisation disabled", your code is /wrong/. In over 25 years in this business, I have never seen an exception.

Remember, there is no such thing as "disabling optimisations" - compilers can re-arrange code and apply whatever transformations they like, according to the C standards, with a total disregard for your choice of optimisation settings. These settings are guidelines, not part of the semantics of the language - the language and the freedoms the compiler has do not change (unless your compiler specifically documents the changes).

And even if you think it is a "workaround" that is good enough for now, you are creating a maintainability nightmare. Or worse - you are creating something that works fine during your testing and fails when deployed.

"Volatile", when used correctly, can be helpful.

Assembly routines for memory accesses are usually a bad idea - inefficient, inflexible and error-prone.

If you want a simple and relatively fool-proof system, all you really need are two functions (preferably inline):

    interrupt_status_t disableGlobalInterrupts(void);
    void restoreGlobalInterrupts(interrupt_status_t old_status);

These must both act as full memory fences.

Then you can put whatever code needs atomic behaviour within a critical section bracketed by these functions.

You need more work if you have other memory masters (DMA, second processor, etc.).
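As a rough sketch of how such a pair is used, the example below keeps two related values coherent. So that it compiles and runs on a host, the two functions are stubbed with a simulated interrupt-enable flag. That flag, the `position_x`/`position_y` variables and the `set_position`/`get_position` helpers are assumptions made for illustration. Real implementations are target-specific and must also act as memory fences, which these stubs do not.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t interrupt_status_t;

/* Host-side stand-in for the CPU's global interrupt enable bit.
 * Assumption: a real target reads and writes a status register instead. */
static bool sim_interrupts_enabled = true;

static inline interrupt_status_t disableGlobalInterrupts(void)
{
    interrupt_status_t old = sim_interrupts_enabled ? 1u : 0u;
    sim_interrupts_enabled = false;
    return old;
}

static inline void restoreGlobalInterrupts(interrupt_status_t old_status)
{
    sim_interrupts_enabled = (old_status != 0u);
}

/* Two values that must always be seen as a matching pair.
 * volatile keeps the accesses inside the critical section here, because
 * the stubs above are not compiler barriers. */
static volatile int32_t position_x;
static volatile int32_t position_y;

void set_position(int32_t x, int32_t y)
{
    interrupt_status_t s = disableGlobalInterrupts();
    position_x = x;
    position_y = y;
    restoreGlobalInterrupts(s);   /* restore, rather than blindly enable */
}

void get_position(int32_t *x, int32_t *y)
{
    interrupt_status_t s = disableGlobalInterrupts();
    *x = position_x;
    *y = position_y;
    restoreGlobalInterrupts(s);
}
```

Returning the old status and restoring it, rather than unconditionally re-enabling interrupts, is what makes the critical sections safe to nest: an inner section called with interrupts already disabled leaves them disabled on exit.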
>
>>
>> I cannot emphasise this enough. The compiler will do what it can to make
>> your program do what it thinks you have asked for - which will NOT be
>> the same as what you think you have asked for.
>>
>> You need to understand about basic mutex operations, preferably also
>> semaphores, and beyond that to read and write barriers (if you want to
>> write lock-free code). It's a big subject.
>
> t least with a small micro controller, simply disable interrupts for a
> critical section. Of course the critical section must behave like a
> real interrupt, limit the number of instructions and do not call any
> library routines.
>
On 05/01/2020 07:34, Paul Rubin wrote:
> pozz <pozzugno@gmail.com> writes:
>> As I already wrote many times, I don't have experience with RTOS and
>> task sync mechanism such as semaphores, locks, mutexes, message queues
>> and so on.
>> So I'm not able to understand when a sync is really needed.
>>
>> Could you point on a good simple material to study (online or book)?
>
> I have found it simplest to have tasks communicate by message passing,
> the so-called "CSP model" (communicating sequential processes), rather
> than fooling around with explicit locks. With locks you have to worry
> about lock inversion and all kinds of other madness, and your main hope
> of getting it right is formal methods, like Lamport used for the Paxos
> algorithm. Message passing incurs some cpu overhead because of the
> interprocess communication and context switches, but it gets rid of a
> lot of ways things go wrong.
>
> If your RTOS supports message passing (look for "mailboxes" in the RTOS
> docs) then I'd say use them.
>
Yes - message passing (whether asynchronous with queues, or synchronous with CSP style) is often a lot easier to get right than complicated locking mechanisms.
> The language most associated with CSP style is Erlang, which doesn't
> really fit on small embedded devices, but Erlang materials might still
> be a good place to learn about the style. Erlang inventor Joe
> Armstrong's book might be a good place to start:
>
>     http://erlang.org/download/erlang-book-part1.pdf
>
I've worked indirectly with Erlang (I made the microcontroller half of the system, in C, while someone else wrote the Linux half in Erlang). I was not impressed - he spent a lot of time figuring out things that should have been very simple. It is just one sample point, of course, and not enough to condemn a whole language - but it does mean Erlang is not high on my "languages to learn when I have time" list.

Far and away the most popular "CSP language" is Go, as I understand it. Another option is XC for XMOS devices, but that is hardware-specific.
> At the much lower end, you could check out Brad Rodriguez's articles
> about Forth multitaskers:
>
>     https://www.bradrodriguez.com/papers/mtasking.html
>
> and related ones at https://www.bradrodriguez.com/papers/ .
>
On Sun, 5 Jan 2020 14:07:37 +0100, David Brown <david.brown@hesbynett.no> wrote:

>On 05/01/2020 10:21, upsidedown@downunder.com wrote:
>> On Sun, 5 Jan 2020 14:26:12 +1100, Clifford Heath <no.spam@please.net>
>
>>>
>>> If "int" is your CPUs word size, you are using word alignment, and you
>>> don't have multiple CPUs with separate caches accessing the same RAM,
>>> you're probably ok for individual variables. However you will come
>>> unstuck if you expect assignments and reads to be performed in the same
>>> order you wrote them. A modern compiler will freely re-order things in
>>> extremely ambitious and unexpected ways, in order to keep the pipeline
>>> flowing.
>>
>> Using volatile declaration and turn of optimization will help. Better
>> yet, use small assembler routines to have full control of actual
>> memory access.
>
>Turning off optimisation is /never/ the answer! (Baring bugging
>compilers, of course.)
>
>If your code "works with optimisation disabled", your code is /wrong/.
>In over 25 years in this business, I have never seen an exception.
>
>Remember, there is no such thing as "disabling optimisations" -
>compilers can re-arrange code and apply whatever transformations they
>like, according to the C standards, with a total disregard for your
>choice of optimisation settings. These settings are guidelines, not
>part of the semantics of the language - the language and the freedoms
>the compiler has do not change (unless your compiler specifically
>documents the changes).
The problem is the C standard, or actually the language lawyers (in most languages), who do not understand multithreading or multiprocessors.
>And even if you think it is a "workaround" that is good enough for now,
>you are creating a maintainability nightmare. Or worse - you are
>creating something that works fine during your testing and fails when
>deployed.
>
>"Volatile", when used correctly, can be helpful.
>
>Assembly routines for memory accesses are usually a bad idea -
>inefficient, inflexible and error-prone.
On any hardware platform that has at least a memory-location increment or decrement operation as a single instruction performing a read/modify/write memory access cycle, this is often quite handy. Even if you can't get an atomic R/M/W cycle, there are often similar tricks, such as using the lock prefix on x86.
>If you want a simple and relatively fool-proof system, all you really
>need are two functions (preferably inline) :
>
>interrupt_status_t disableGlobalInterrupts(void);
>void restoreGlobalInterrupts(interrupt_status_t old_status);
>
>These must both act as full memory fences.
>
>Then you can put whatever code needs atomic behaviour within a critical
>section bracketed by these functions.
Is this standard C in some recent version of the standard?
>You need more work if you have other memory masters (DMA, second >processor, etc.).
This has a lot to do with cache coherence.
On 2020-01-05 15:12, David Brown wrote:
> On 05/01/2020 07:34, Paul Rubin wrote:
>> pozz <pozzugno@gmail.com> writes:
>>> As I already wrote many times, I don't have experience with RTOS and
>>> task sync mechanism such as semaphores, locks, mutexes, message queues
>>> and so on.
>>> So I'm not able to understand when a sync is really needed.
>>>
>>> Could you point on a good simple material to study (online or book)?
>>
>> I have found it simplest to have tasks communicate by message passing,
>> the so-called "CSP model" (communicating sequential processes), rather
>> than fooling around with explicit locks.
[snip]
> Yes - message passing (whether asynchronous with queues, or synchronous
> with CSP style) is often a lot easier to get right than complicated
> locking mechanisms.
>
>> The language most associated with CSP style is Erlang, which doesn't
>> really fit on small embedded devices, but Erlang materials might still
>> be a good place to learn about the style. Erlang inventor Joe
>> Armstrong's book might be a good place to start:
>>
>>     http://erlang.org/download/erlang-book-part1.pdf
[snip]
> Far and away the most popular "CSP language" is Go, as I understand it.
> Another option is XC for XMOS devices, but that is hardware-specific.
Another language with CSP-style primitives is Ada (the "rendez-vous" feature), although AIUI most embedded Ada programs currently being implemented use the alternative "monitor"-like primitives (the "protected object" feature), which can be used to implement critical regions, CSP-like message passing, buffered (queued) message passing, and many other styles.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .
On 05/01/2020 16:46, upsidedown@downunder.com wrote:
> On Sun, 5 Jan 2020 14:07:37 +0100, David Brown
> <david.brown@hesbynett.no> wrote:
>
>> On 05/01/2020 10:21, upsidedown@downunder.com wrote:
>>> On Sun, 5 Jan 2020 14:26:12 +1100, Clifford Heath <no.spam@please.net>
>>
>>>>
>>>> If "int" is your CPUs word size, you are using word alignment, and you
>>>> don't have multiple CPUs with separate caches accessing the same RAM,
>>>> you're probably ok for individual variables. However you will come
>>>> unstuck if you expect assignments and reads to be performed in the same
>>>> order you wrote them. A modern compiler will freely re-order things in
>>>> extremely ambitious and unexpected ways, in order to keep the pipeline
>>>> flowing.
>>>
>>> Using volatile declaration and turn of optimization will help. Better
>>> yet, use small assembler routines to have full control of actual
>>> memory access.
>>
>> Turning off optimisation is /never/ the answer! (Baring bugging
>> compilers, of course.)
>>
>> If your code "works with optimisation disabled", your code is /wrong/.
>> In over 25 years in this business, I have never seen an exception.
>>
>> Remember, there is no such thing as "disabling optimisations" -
>> compilers can re-arrange code and apply whatever transformations they
>> like, according to the C standards, with a total disregard for your
>> choice of optimisation settings. These settings are guidelines, not
>> part of the semantics of the language - the language and the freedoms
>> the compiler has do not change (unless your compiler specifically
>> documents the changes).
>
> The problem is the C standard or actually the language lawyers (in
> most languages) who do not have understanding for multithreading or
> multiprocessors.
The main point of C11 is support for multi-threading and multi-processor systems. The standards, and the language lawyers, /do/ understand it.

The more advanced and progressive compilers support C11. Many embedded ones do not, but that is the fault of the compiler vendors, and perhaps of developers who don't realise that they should be insisting on it.

The big missing feature, however, is that you need an implementation of some of the functions in the C11 threading libraries, and the implementation must fit the OS in use. That's not too hard for Linux or Windows, but it is a different world in embedded systems. Still, it should be possible to make C11 library support for FreeRTOS, mbed, and any other RTOS you like.

Key points like atomics, fences, and language semantics for multi-threading are in place. (And C++ is more helpful in providing higher level multi-threading features.)

So we are far from having nice multi-threading integration in C toolchains, but not nearly as far as you suggest.
>
>> And even if you think it is a "workaround" that is good enough for now,
>> you are creating a maintainability nightmare. Or worse - you are
>> creating something that works fine during your testing and fails when
>> deployed.
>>
>> "Volatile", when used correctly, can be helpful.
>>
>> Assembly routines for memory accesses are usually a bad idea -
>> inefficient, inflexible and error-prone.
>
> In any hardware platforms with at least memory location increment or
> decrement operation as on a single instruction performing
> read/modify/write memory access cycle is often quite hardly.
>
/Some/ hardware platforms that let you do "x++" as a single instruction on memory, do so atomically. Many others do not. Typically, they are atomic on small 8-bit CISC microcontrollers. On larger processors, you rarely get such instructions at all (they don't exist on any kind of RISC cpu). And even when you /do/ get them, they may be implemented by multiple separate actions. Perhaps they are atomic with respect to other code on the same core (such as interrupts), but not with respect to DMA or other cores. So this kind of thing can sometimes be acceptable on target-specific code for small microcontrollers, but not otherwise.
> Even if you can't get an atomic R/M/W cycle, there are often similar
> tricks e.g. such as using the lock prefix in x86
>
You do that using intrinsics or C11/C++11 atomics. You certainly don't do it with "volatile".
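For illustration, a counter shared between contexts could be written with C11 atomics roughly as follows. The names `rx_frame_count`, `count_frame` and `frames_seen` are invented for this sketch, and it assumes a toolchain that provides `<stdatomic.h>`.

```c
#include <stdatomic.h>

/* Shared counter; static storage, so it starts at zero. */
static atomic_uint rx_frame_count;

/* Increments the counter as one indivisible read-modify-write and
 * returns the new value. */
unsigned count_frame(void)
{
    /* atomic_fetch_add returns the value held before the addition. */
    return atomic_fetch_add(&rx_frame_count, 1u) + 1u;
}

/* Returns the current count. */
unsigned frames_seen(void)
{
    return atomic_load(&rx_frame_count);
}
```

How the compiler implements this is target-dependent: it may emit a locked instruction, a load-exclusive/store-exclusive loop, or a call into a support library. It covers other threads of execution, not other memory masters such as DMA.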
>
>> If you want a simple and relatively fool-proof system, all you really
>> need are two functions (preferably inline) :
>>
>> interrupt_status_t disableGlobalInterrupts(void);
>> void restoreGlobalInterrupts(interrupt_status_t old_status);
>>
>> These must both act as full memory fences.
>>
>> Then you can put whatever code needs atomic behaviour within a critical
>> section bracketed by these functions.
>
> Is this standard C in some recent standard variant ?
No C standard covers interrupts, or ways to disable and enable them - that is highly target-specific. For example, on the ARM Cortex-M, you might use:

    #include <stdint.h>
    #include "core_cmFunc.h"

    typedef uint32_t interrupt_status_t;

    static inline interrupt_status_t disableGlobalInterrupts(void) {
        interrupt_status_t old = __get_PRIMASK();   /* remember current mask */
        __disable_irq();
        return old;
    }

    static inline void restoreGlobalInterrupts(interrupt_status_t old) {
        __set_PRIMASK(old);                         /* put the mask back */
    }

If you don't want to use the ARM core functions, you can use inline assembly - but that is compiler specific. For gcc, that would be:

    #include <stdint.h>

    typedef uint32_t interrupt_status_t;

    static inline interrupt_status_t disableGlobalInterrupts(void) {
        interrupt_status_t old;
        asm volatile ("mrs %0, primask" : "=r" (old));
        /* The "memory" clobber makes this a compiler memory barrier. */
        asm volatile ("cpsid i" : : : "memory");
        return old;
    }

    static inline void restoreGlobalInterrupts(interrupt_status_t old) {
        asm volatile ("msr primask, %0" : : "r" (old) : "memory");
    }

C11 provides standard support for a memory barrier, but since you need compiler-specific code for the implementation anyway, you might as well use the compiler-specific memory barrier.
>
>
>> You need more work if you have other memory masters (DMA, second
>> processor, etc.).
>
> This has a lot to do with cache coherence.
>
That is one aspect, yes. But it is not the only one. For example, bigger processors can have write buffers with re-ordering.
