Reply by Niklas Holsti January 18, 2020
On 2020-01-17 3:05, pozz wrote:
> Il 16/01/2020 21:24, Niklas Holsti ha scritto:
>> On 2020-01-16 15:48, pozz wrote:
>>> Il 15/01/2020 17:53, Niklas Holsti ha scritto:
>>    ...
>>>> The approach where all real-time "tasks" are interrupts is called
>>>> "foreground-background", I believe. It was popular in the
>>>> microcomputer era, before real-time kernels became popular -- it
>>>> gave most of the same benefits (pre-emption) but scheduled by HW
>>>> (interrupt priorities).
>>>
>>> Yes, "foreground-background" is another name, but it says nothing
>>> about the blocking/non-blocking of background (non-interrupt) tasks.
>>> If you have blocking (long-running) tasks, you will not have a
>>> reactive system (only interrupts will be reactive).
>>
>> Reactivity was the very point of the architecture - anything that
>> needed quick response executed at interrupt level. The background was
>> just a sequential process that handled all the non-time-critical
>> things (often stuff delegated from the interrupt levels).
>>
>> Those machines usually had prioritized interrupt systems where a
>> higher-priority interrupt could interrupt (and thus pre-empt) a
>> running lower-priority interrupt handler. So the systems were
>> definitely reactive.
>
> I was using the word "reactive" with another meaning.
>
> Real-time response is critical and must be managed by interrupts (which
> can preempt background tasks). If a char is received from a UART with a
> single-byte FIFO, you *need* to move it to RAM as soon as possible, to
> avoid discarding the next byte that could arrive immediately after.
>
> Reactivity is much less critical. You press a button on a keyboard and
> you expect to see that character on the display as soon as possible. It
> can appear after 10ms and you will be very happy (wow, this PC is very
> reactive), but it can appear, maybe a few times, after 500ms and you
> will be less happy (shit, how slow is this PC?). In both cases, the
> system does its job.
So you are making a difference between hard and soft deadlines. Ok. And certainly the "background" task in a foreground-background design can be designed as a master loop (or major/minor-cycle loop) to give shorter response times for this task, resulting in a hybrid design. I would not use the term "reactivity" to denote this difference, but I suppose you can if you want to.
> Of course, I know there are some extreme cases where you have 100 tasks > (I admit it's impossible to maintain a good general reactivity in this > case, even if every single task is fast) or when a task is very CPU > intensive (FFT, encryption, ...).
No, not "impossible". It depends on your quantitative definition of "good", and on whether you have a single master loop or a major/minor hierarchical loop. An example application is on its way to orbit Mercury, on-board the Bepi-Colombo spacecraft: the on-board SW for the the MIXS/SIXS X-ray instruments. My colleagues who implemented this SW wanted to avoid a real-time, multi-tasking kernel (for a reason never really explained to me) but they had real-time deadlines to meet. They decided on a one-second major cycle and a one-millisecond time quantum. They made a fixed schedule saying exactly what the SW should execute in each of the 1000 milliseconds in each second. Some jobs are allocated more than one millisecond, so I think they have about 300 "jobs" to execute in sequence in each second. The schedule allocates several millisecond slots, evenly spaced, for jobs with deadlines shorter than one second. (I forget what the shortest deadline is, but they also use interrupts for the very short ones.) The SW works, of course.
> I think those are examples of the few applications where a preemptive
> multitasking approach is the right answer for good reactivity.
Almost all my applications are in that class, so I see them as more than a few.
> But in many many other applications you can guarantee real-time (through
> interrupts) and reactivity (through non-blocking background tasks) with a
> simple superloop cooperative approach, without multitask issues (...and
> critical bugs) but with an additional complexity (state-machine).
You still have "multitask issues" between the interrupt handlers and the background tasks. But I do agree that some applications can be implemented as master-loop or major/minor-cycle loops without excessive complication. In the MIXS/SIXS example I mentioned above, the designers had to slice several SW loops, which would have taken too long if executed in one job from start to end, into pieces executed in several jobs, with some variables keeping track of how far the loop has progressed. For myself, I would be more worried about errors in such algorithmic distortions than about bugs from multi-task issues.
>>> If the tasks are coded as non-blocking (maybe with the help of
>>> state-machines), you will be able to have a reactive system even for
>>> the non-interrupt tasks.
>>
>> Not needed in a foreground-background system.
>
> What is not needed? Reactivity?
A concern for real-time responsiveness of the background part. Perhaps this is a matter of definition. I would call something a foreground-background system only if it implements _all_ its real-time deadlines (whether hard or soft) in interrupt handlers, and the background task only has to provide enough average throughput to avoid a build-up and overflow of background jobs.
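In code, that strict reading of foreground-background reduces to something like the sketch below (the register access and byte handler are hypothetical): the hard deadline lives entirely in the ISR, and the background loop only needs sufficient average throughput.

    #include <stdint.h>

    #define BUF_SIZE 256

    /* Single-producer (ISR) / single-consumer (background) byte FIFO. */
    static volatile uint8_t  fifo[BUF_SIZE];
    static volatile unsigned head, tail;

    extern uint8_t uart_read_data_register(void);  /* hypothetical HW access */
    extern void    process_byte(uint8_t c);        /* hypothetical consumer  */

    /* Foreground: the hard deadline (drain the 1-byte UART buffer before
       the next character arrives) is met here, at interrupt level. */
    void uart_rx_isr(void)
    {
        uint8_t c = uart_read_data_register();
        unsigned next = (head + 1u) % BUF_SIZE;
        if (next != tail) {     /* drop the byte if the FIFO is full */
            fifo[head] = c;
            head = next;
        }
    }

    /* Background: no deadline; it only has to keep up on average so the
       FIFO does not overflow. */
    void background_loop(void)
    {
        for (;;) {
            while (tail != head) {
                uint8_t c = fifo[tail];
                tail = (tail + 1u) % BUF_SIZE;
                process_byte(c);        /* all non-time-critical work */
            }
            /* other housekeeping ... */
        }
    }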
>> Note also that even if the CRC computation is fast enough not to
>> endanger other deadlines, if you make it one "task" in a superloop
>> with 100 other "tasks", it will be called only once per 100 "task"
>> executions (once per superloop iteration) which will limit the rate of
>> packet flow, which may violate a hard performance requirement.
>
> Again you show examples that appear to me at minimum "extreme", not
> "typical". How often do you write applications with 100 tasks?
30 tasks is common for me. If I had to divide them into "non-blocking" operations to be called from a master loop, and limit myself to very simple interrupt handlers, the number could grow even higher.
> [OT]
> Anyway I don't know how much space those little monsters have in the
> market. Should I design a board with a lot of fast interfaces, and were
> cost or power consumption not so important, most probably I would choose
> an embedded Linux SOM with a lot of Flash and RAM. And maybe I would
> choose Python as the language, which speeds up development a lot.
There are people in my domain (space SW) who take that approach, too. But it has drawbacks... I was recently involved in a microsatellite design where the system designers at first intended to have two Linux-based, two-core, high-speed boards with nearly 100 MB of RAM, one for application (image) processing, and the other for system control functions. But when they added up their electrical power needs they found a problem... and the system-control computer was reduced to a Cortex microcontroller with 256 KB of RAM. But it still had to run a CAN-bus and two UART interfaces in parallel, as well as execute a time-tagged schedule, as well as monitor its own health and other data, etc. Python would not do, I think.
> [/OT]
...
> Maybe multitasking issues aren't too complex... issues, after acquiring > some experience and knowledge. At the moment they appear to me very > complex.
I would say, if you have mastered writing interrupt handlers, and the interactions between interrupt handlers and background tasks, you will surely be able to master pre-emptive tasking, where one can use high-level primitives (message-passing, rendez-vous, ...) which are often not allowed at interrupt levels.
> I have an example of an IoT board based on a Cortex-M3 MCU by NXP. The
> MCU communicates with a remote device through an RS485 bus, and with a
> Cloud system (actually AWS) through Ethernet.
> In this application I have three main foreground ISR-based drivers
> (timer, Ethernet and UART) and some background tasks: the lwip TCP/IP
> stack, the RS485 protocol parser, LEDs and some other minor ones.
>
> The real-time requirements are only for the communication links: we
> shouldn't lose Ethernet frames nor RS485 characters.
> When some events are communicated over RS485, a message is published on
> the Cloud as soon as possible.
>
> This system is implemented with a superloop cooperative approach and I
> can say the reactivity is good.
Good, then. But it seems you do not quantify "good".
> The Internet protocol is MQTTs, so the security layer mbedTLS runs
> together with lwip. It hasn't been a simple job, because mbedTLS needed
> a lot of RAM during the TLS session handshake.
> lwip can work in "no OS" systems by calling the sys_check_timeouts()
> function regularly... and I simply call it in my superloop.
I haven't programmed such devices, or any Internet-connected devices on those levels, so I can't comment. My experience is limited to the "socket" level.
> I admit mbedTLS is very slow during TLS session startup, because of
> asymmetric encryption (the MCU doesn't have a hw crypto engine). During
> this period reactivity isn't good.
> I don't lose chars from RS485 (thanks to interrupts), but a timeout
> expires on the RS485 remote device, which thinks the Ethernet board is
> dead. Luckily this happens only during TLS startup and the problem can
> be filtered in a simple way.
>
> Recently I have seen that mbedTLS added a feature to split CPU-intensive
> calculations into smaller steps, exactly to avoid blocking in cooperative
> systems[*]. I think they implement something similar to a state-machine.
With all the security bugs and problems that have been and are being found in network protocols, it seems wrong to add such unnecessary complications in the algorithms and usage.
> Just to say that this isn't such an illogical approach.
It is often used, sure, but I always wonder why, given that free, small, efficient, real-time, pre-emptive kernels are so easily available today. I can understand it for systems that are really squeezed for RAM or flash space, but not otherwise. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ .
Reply by pozz January 16, 2020
Il 16/01/2020 21:24, Niklas Holsti ha scritto:
> On 2020-01-16 15:48, pozz wrote:
>> Il 15/01/2020 17:53, Niklas Holsti ha scritto:
>    ...
>>> The approach where all real-time "tasks" are interrupts is called
>>> "foreground-background", I believe. It was popular in the
>>> microcomputer era, before real-time kernels became popular -- it gave
>>> most of the same benefits (pre-emption) but scheduled by HW
>>> (interrupt priorities).
>>
>> Yes, "foreground-background" is another name, but it says nothing about
>> the blocking/non-blocking of background (non-interrupt) tasks.
>> If you have blocking (long-running) tasks, you will not have a
>> reactive system (only interrupts will be reactive).
>
> Reactivity was the very point of the architecture - anything that needed
> quick response executed at interrupt level. The background was just a
> sequential process that handled all the non-time-critical things (often
> stuff delegated from the interrupt levels).
>
> Those machines usually had prioritized interrupt systems where a
> higher-priority interrupt could interrupt (and thus pre-empt) a running
> lower-priority interrupt handler. So the systems were definitely reactive.
I was using the word "reactive" with another meaning.

Real-time response is critical and must be managed by interrupts (which can preempt background tasks). If a char is received from a UART with a single-byte FIFO, you *need* to move it to RAM as soon as possible, to avoid discarding the next byte that could arrive immediately after.

Reactivity is much less critical. You press a button on a keyboard and you expect to see that character on the display as soon as possible. It can appear after 10ms and you will be very happy (wow, this PC is very reactive), but it can appear, maybe a few times, after 500ms and you will be less happy (shit, how slow is this PC?). In both cases, the system does its job.

I usually manage the reactivity of the system in non-interrupt background tasks. In order to have high reactivity, I need to keep the superloop duration short (of course, if there are at most 10 tasks or so), so my first rule when implementing a task is: it mustn't block waiting for I/O and mustn't spend too much time in calculations. If this happens, I try to transform the task into a state-machine, and I can say the reactivity of the system is normally very good.

Of course, I know there are some extreme cases where you have 100 tasks (I admit it's impossible to maintain good general reactivity in this case, even if every single task is fast) or when a task is very CPU-intensive (FFT, encryption, ...). I think those are examples of the few applications where a preemptive multitasking approach is the right answer for good reactivity.

But in many, many other applications you can guarantee real-time (through interrupts) and reactivity (through non-blocking background tasks) with a simple superloop cooperative approach, without multitask issues (...and critical bugs) but with an additional complexity (state-machines).
>> If the tasks are coded as non-blocking (maybe with the help of
>> state-machines), you will be able to have a reactive system even for
>> the non-interrupt tasks.
>
> Not needed in a foreground-background system.
What is not needed? Reactivity?
>>>> [pozz:] Interrupts are always present, even in superloop. >>> >>> Again a design choice. Some critical systems don't want *any* >>> pre-emption, even by interrupts -- simplicity reigns. ... >> >> I don't think this is an example that can be considered "typical" and >> I know there are "extreme cases" where a singular approach is needed. > > Ok, but you said "always".
My shame... often.
> I remember other examples of systems not > using interrupts -- usually small, quick-and-dirty-but-good-enough > applications. > >>>> [pozz:] Maybe I am wrong, but I implement ISRs with great care >>>> and attention, >>> >>> Of course you are right to use great care and attention, and >>> especially in ISRs. >>> >>>> but they are very small and limited. >>> >>> Many tasks in pre-emptive systems are also small and limited -- for >>> example, one of my designs has a task whose main loop just takes a >>> data-packet from an input queue, computes its CRC and appends it to >>> the packet, and puts the updated packet in an output queue. A handful >>> of source lines. ... >> I think you could write that task without a state-machine in superloop >> architecture: > > If the result is fast enough, compared to the shortest deadline, of > course you can. But you can't be sure it will be fast enough if you > don't know (a) the maximum length of the packets (b) the speed of the > processor (c) the shortest deadline that would be at risk. > > Note also that even if the CRC computation is fast enough not to > endanger other deadlines, if you make it one "task" in a superloop with > 100 other "tasks", it will be called only once per 100 "task" executions > (once per superloop iteration) which will limit the rate of packet flow, > which may violate a hard performance requirement.
Again you show examples that appear to me at minimum "extreme", not "typical". How often do you write applications with 100 tasks?

Ok, maybe it's better to say that I usually work on MCUs with internal Flash and RAM, just to say Cortex-Mx. My experience is with MCUs that run at maximum 100MHz and manage Ethernet or TFT, not together. Now those MCUs are little monsters that can manage at the same time Ethernet, SD cards with FAT filesystems, USB, a TFT display and so on. In those complex applications, I admit a preemptive approach could be a better choice.

[OT]
Anyway I don't know how much space those little monsters have in the market. Should I design a board with a lot of fast interfaces, and were cost or power consumption not so important, most probably I would choose an embedded Linux SOM with a lot of Flash and RAM. And maybe I would choose Python as the language, which speeds up development a lot.
[/OT]
> To fix that problem, you would have to go to a "major cycle - minor > cycle" design in which the superloop is the "major" cycle, but consists > of some number of "minor" cycles each of which calls the CRC computation > "task", to ensure that it is called often enough. This was the standard > approach to real-time systems before pre-emption became acceptable. > > I only gave the CRC task as an example of a "task" that is as short as > or shorter than typical interrupt handlers. But perhaps that was going > off-topic. > > (You may ask, why was that CRC calculation in a task of its own, and not > just in a subroutine called from other tasks? Because the processor had > a HW CRC unit, which the task used, and the CRC task serialized accesses > to that HW unit. So the task in fact blocked, waiting for the HW unit to > compute the CRC.) > >> And here you don't have to think about multithreading issues (race >> condition, sharing resources, ...). > > The multithreading issues in the CRC task are solved by the queue > data-structure, which is "task-safe" and used all around the system. No > specific multithreading problems for the CRC task. Five minutes to > design, 15 to implement.
Maybe multitasking issues aren't too complex... issues, after acquiring some experience and knowledge. At the moment they appear to me very complex.
>>>> [pozz:] Normally the biggest part of the firmware is related to the >>>> application/tasks, not interrupts. >>> >>> Except in the aforementioned foreground-background designs. >> >> Even in foreground-background ISRs are limited to critical real-time >> events. > > You still don't get the point. In a foreground-background system, the > "interrupt service routines" (the foreground) *are* the tasks and > implement all of the main system functions. The background process is > used only for the "left-overs" -- say, memory scrubbing in a space-based > system. Of course there may be a considerable amount of "left-overs" if > deadlines are tight and some non-urgent but interrupt-triggered > activities can't fit in the foreground processing.
What I mean with "foreground-background" system is another thing. A few critical real-time requirements are managed in foreground interrupts routines, the biggest part of the application, that should be reactive for the user, are managed by some background tasks. I have an example of an IoT board based on a Cortex-M3 MCU by NXP. The MCU communicates with a remote device through a RS485 bus, and with a Cloud system (actually AWS) through Ethernet. In this application I have three main foreground ISR-based drivers (timer, Ethernet and UART) and some background tasks: lwip TCP/IP stack, RS485 protocol parser, LEDs and some other minors. The real-time requirements are only for communication links: we shouldn't loose Ethernet frames nor RS485 characters. When some events are communicated over RS485, a message is published on the Cloud as soon as possible. This system is implemented with a superloop cooperative approach and I can say the reactivity is good. The Internet protocol is MQTTs, so the security layer mbedTLS runs together with lwip. It hasn't been a simple job, because mbedTLS needed a lot of RAM during TLS session handshake. lwip can work in "no OS" systems by calling sys_check_timeouts() function regularly... and I simply call it in my superloop. MQTT client state-machine is not so complex, there are only a few main states: - waiting for DNS replay (to resolve server domain address) - waiting for MQTT connection ack from the server - MQTT connected (where the message are published) I admit mbedTLS is very slow during TLS session startup, because of asymmetric encryption (the MCU doesn't have hw crypto engine). During this period reactivity isn't good. I don't loose chars from RS485 (thanks to interrupts), but a timeout expires on the RS485 remote device that thinks the Ethernet board is dead. Luckily this happens only during TLS startup and the problem can be filtered in a simple way. Recently I have seen that mbedTLS added a feature to split CPU intensive calculations in smaller steps, exactly to avoid blocking in cooperative systems[*]. I think they implent somewhat similar to a state-machine. Just to say that this isn't so illogical approach. [*]https://tls.mbed.org/kb/development/restartable-ecc
Reply by Niklas Holsti January 16, 2020
On 2020-01-16 15:48, pozz wrote:
> Il 15/01/2020 17:53, Niklas Holsti ha scritto:
...
>> The approach where all real-time "tasks" are interrupts is called
>> "foreground-background", I believe. It was popular in the
>> microcomputer era, before real-time kernels became popular -- it gave
>> most of the same benefits (pre-emption) but scheduled by HW (interrupt
>> priorities).
>
> Yes, "foreground-background" is another name, but it says nothing about
> the blocking/non-blocking of background (non-interrupt) tasks.
> If you have blocking (long-running) tasks, you will not have a reactive
> system (only interrupts will be reactive).
Reactivity was the very point of the architecture - anything that needed quick response executed at interrupt level. The background was just a sequential process that handled all the non-time-critical things (often stuff delegated from the interrupt levels). Those machines usually had prioritized interrupt systems where a higher-priority interrupt could interrupt (and thus pre-empt) a running lower-priority interrupt handler. So the systems were definitely reactive.
> If the tasks are coded as non-blocking (maybe with the help of
> state-machines), you will be able to have a reactive system even for the
> non-interrupt tasks.
Not needed in a foreground-background system.
>>> [pozz:] Interrupts are always present, even in superloop. >> >> Again a design choice. Some critical systems don't want *any* >> pre-emption, even by interrupts -- simplicity reigns. ... > > I don't think this is an example that can be considered "typical" and I > know there are "extreme cases" where a singular approach is needed.
Ok, but you said "always". I remember other examples of systems not using interrupts -- usually small, quick-and-dirty-but-good-enough applications.
>>> [pozz:] Maybe I am wrong, but I implement ISRs with great care >>> and attention, >> >> Of course you are right to use great care and attention, and >> especially in ISRs. >> >>> but they are very small and limited. >> >> Many tasks in pre-emptive systems are also small and limited -- for >> example, one of my designs has a task whose main loop just takes a >> data-packet from an input queue, computes its CRC and appends it to >> the packet, and puts the updated packet in an output queue. A handful >> of source lines. ... > I think you could write that task > without a state-machine in superloop architecture:
If the result is fast enough, compared to the shortest deadline, of course you can. But you can't be sure it will be fast enough if you don't know (a) the maximum length of the packets, (b) the speed of the processor, (c) the shortest deadline that would be at risk.

Note also that even if the CRC computation is fast enough not to endanger other deadlines, if you make it one "task" in a superloop with 100 other "tasks", it will be called only once per 100 "task" executions (once per superloop iteration), which will limit the rate of packet flow, which may violate a hard performance requirement.

To fix that problem, you would have to go to a "major cycle - minor cycle" design in which the superloop is the "major" cycle, but consists of some number of "minor" cycles, each of which calls the CRC computation "task", to ensure that it is called often enough. This was the standard approach to real-time systems before pre-emption became acceptable.

I only gave the CRC task as an example of a "task" that is as short as or shorter than typical interrupt handlers. But perhaps that was going off-topic.

(You may ask, why was that CRC calculation in a task of its own, and not just in a subroutine called from other tasks? Because the processor had a HW CRC unit, which the task used, and the CRC task serialized accesses to that HW unit. So the task in fact blocked, waiting for the HW unit to compute the CRC.)
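A sketch of that major/minor-cycle shape, with generic names (not code from any of the systems discussed; wait_for_minor_tick() is a hypothetical time base):

    /* Generic major/minor cycle: task_crc() runs every minor cycle,
       the slower tasks are spread out, one per minor cycle. */
    #define MINOR_CYCLES 10

    extern void task_crc(void);             /* high-rate "task"            */
    extern void slow_task(unsigned which);  /* slow tasks 0..MINOR_CYCLES-1 */
    extern void wait_for_minor_tick(void);  /* hypothetical time base      */

    void major_cycle(void)
    {
        for (;;) {   /* one iteration = one major cycle */
            for (unsigned minor = 0; minor < MINOR_CYCLES; minor++) {
                wait_for_minor_tick();
                task_crc();          /* MINOR_CYCLES calls per major cycle */
                slow_task(minor);    /* each slow task once per major cycle */
            }
        }
    }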
> And here you don't have to think about multithreading issues (race > condition, sharing resources, ...).
The multithreading issues in the CRC task are solved by the queue data-structure, which is "task-safe" and used all around the system. No specific multithreading problems for the CRC task. Five minutes to design, 15 to implement.
>>> [pozz:] Normally the biggest part of the firmware is related to the >>> application/tasks, not interrupts. >> >> Except in the aforementioned foreground-background designs. > > Even in foreground-background ISRs are limited to critical real-time > events.
You still don't get the point. In a foreground-background system, the "interrupt service routines" (the foreground) *are* the tasks and implement all of the main system functions. The background process is used only for the "left-overs" -- say, memory scrubbing in a space-based system. Of course there may be a considerable amount of "left-overs" if deadlines are tight and some non-urgent but interrupt-triggered activities can't fit in the foreground processing. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ .
Reply by January 16, 2020
On Thu, 16 Jan 2020 14:48:34 +0100, pozz <pozzugno@gmail.com> wrote:

>I don't think the CRC computation takes too long to split it into smaller
>and faster steps (state-machine). I think you could write that task
>without a state-machine in superloop architecture:
>
>void task_CRC(void)
>{
>    uint8_t *packet;
>    size_t packet_len;
>
>    int err = queue_pop(&packet, &packet_len);
>    if (!err) {
>        packet[packet_len] = crc(packet, packet_len);
>        packet_len++;
>        queue_push(packet, packet_len);
>    }
>}
What is the problem? In the character-receiving ISR, calculate the partial CRC/checksum for each received character. At the end of the message, you do need to process the last received character to check if the CRC matches.

In fact I have done a Modbus RTU line monitor for hardware without accurate timing information. After receiving a byte, calculate a partial CRC and compare it with the last two bytes. If the check matches, assume it was the end of frame. Then perform a sanity check that the frame is longer than the minimum length and that it makes sense according to the protocol specification.

If these additional tests fail, assume that this was a premature CRC match. Continue reading bytes until a new CRC match is obtained. Perform the same sanity test. If the received frame is longer than the maximum valid frame, move the start position one byte ahead and perform a long CRC calculation.

There are surprisingly many false CRC triggerings, but resynchronisation is usually obtained quite quickly.
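That monitor can be sketched roughly as below. The CRC step is the standard Modbus CRC-16 (polynomial 0xA001, initial value 0xFFFF, low byte transmitted first); frame_makes_sense() stands for the protocol sanity checks, and for clarity the CRC is recomputed from scratch on each byte instead of keeping incremental partial CRCs as described above.

    #include <stdint.h>
    #include <stddef.h>

    #define MIN_FRAME   4   /* shortest valid Modbus RTU frame, incl. CRC */
    #define MAX_FRAME 256

    /* One step of the Modbus CRC-16 (poly 0xA001, init 0xFFFF). */
    static uint16_t crc16_step(uint16_t crc, uint8_t byte)
    {
        crc ^= byte;
        for (int i = 0; i < 8; i++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xA001u : (crc >> 1);
        return crc;
    }

    static uint8_t frame[MAX_FRAME];
    static size_t  len;

    extern int  frame_makes_sense(const uint8_t *f, size_t n); /* hypothetical */
    extern void frame_received(const uint8_t *f, size_t n);    /* hypothetical */

    /* Feed every received byte here.  End of frame is assumed when the
       last two buffered bytes equal the CRC of everything before them
       and the sanity checks pass. */
    void monitor_rx_byte(uint8_t byte)
    {
        frame[len++] = byte;

        if (len >= MIN_FRAME) {
            uint16_t crc = 0xFFFF;
            for (size_t i = 0; i < len - 2; i++)
                crc = crc16_step(crc, frame[i]);
            if (frame[len - 2] == (uint8_t)(crc & 0xFF) &&
                frame[len - 1] == (uint8_t)(crc >> 8) &&
                frame_makes_sense(frame, len)) {
                frame_received(frame, len);
                len = 0;                     /* resynchronised */
                return;
            }
        }
        if (len == MAX_FRAME) {
            /* No valid frame within the maximum length: assume a missed
               start, drop the first byte and try one byte later. */
            for (size_t i = 1; i < len; i++)
                frame[i - 1] = frame[i];
            len--;
        }
    }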
Reply by pozz January 16, 2020
Il 15/01/2020 17:53, Niklas Holsti ha scritto:
> On 2020-01-10 1:32, pozz wrote:
>> Il 09/01/2020 13:03, Niklas Holsti ha scritto:
>>> On 2020-01-09 13:19, pozz wrote:
>>>> Il 08/01/2020 23:54, Niklas Holsti ha scritto:
>>>>> On 2020-01-08 1:02, pozz wrote:
>>>>>> Il 07/01/2020 08:38, Niklas Holsti ha scritto:
>>>
>>>     [snip]
>>>
>>>>>>> Let's assume these requirements and properties of the environment:
>>>>>>>
>>>>>>> A. The function "serial_rx" polls the one-character reception
>>>>>>> buffer of the serial line once, and returns the received
>>>>>>> character, if any, and EOF otherwise. It must be called at least
>>>>>>> as often as characters arrive (that is, depending on baud rate)
>>>>>>> to avoid overrun and loss of some characters.
>>>>>
>>>>> You asked about possible advantages of pre-emption; I made my
>>>>> assumptions, above, such that the (incomplete) example you gave
>>>>> shows this advantage, under these assumptions (which could be true
>>>>> for other, otherwise similar example applications).
>>>>>
>>>>>> No, the serial driver works in interrupt mode and already uses a
>>>>>> FIFO buffer, sufficiently big. serial_rx() pops a single element
>>>>>> from the FIFO, if any.
>>>>>
>>>>> Ah, then your *system* is intrinsically pre-emptive (the interrupts
>>>>> pre-empt the tasks), even if the *code you showed* does not show
>>>>> this pre-emption.
>>>>
>>>> Ah yes, interrupts are preemptive and I use them a lot, but they are
>>>> confined in their work, they are very lightweight and fast.
>>>
>>> That's a design decision. Some systems do most of their work in
>>> interrupt handlers, and use "background" processing only for some
>>> non-critical housekeeping tasks.
>>
>> Yes, I imagine there is a plethora of possibilities. Anyway I thought
>> there were two typical approaches: a superloop that continuously calls
>> non-blocking functions (an example of a cooperative scheduler) and a
>> full preemptive scheduler (most of the time a full RTOS).
>
> The approach where all real-time "tasks" are interrupts is called
> "foreground-background", I believe. It was popular in the microcomputer
> era, before real-time kernels became popular -- it gave most of the same
> benefits (pre-emption) but scheduled by HW (interrupt priorities).
Yes "foreground-background" is another name, but it says nothing about the blocking/non-blocking of background (non interrupts) tasks. If you have blocking (long running) tasks, you will not have a reactive system (only interrupts will be reactive). If the tasks are coded as non-blocking (maybe with the help of state-machines), you will be able to have a reactive system even for the non-interrupts tasks.
>> Interrupts are always present, even in superloop. > > Again a design choice. Some critical systems don't want *any* > pre-emption, even by interrupts -- simplicity reigns. So they poll, and > accept the performance hit. For example, some colleagues of mine > recently completed the "recovery SW" for ESA's ExoMars rover, which > takes over if the "real" SW crashes and is intended to be super-safe and > just support problem analysis and recovery. Using any kind of real-time > kernel was forbidden -- only sequential coding was allowed. Any new, > nice SW "feature" that my colleagues suggested, for example to speed up > communication with Earth, was denied by the customer because it would > complicate the SW a little bit -- perhaps one more conditional > statement. No-no.
I don't think this is an example that can be considered "typical" and I know there are "extreme cases" where a singular approach is needed.
>> Maybe I am wrong, but I implement ISRs with great care and attention,
>
> Of course you are right to use great care and attention, and especially
> in ISRs.
>
>> but they are very small and limited.
>
> Many tasks in pre-emptive systems are also small and limited -- for
> example, one of my designs has a task whose main loop just takes a
> data-packet from an input queue, computes its CRC and appends it to the
> packet, and puts the updated packet in an output queue. A handful of
> source lines.
>
> I would say that having a pre-emptive system relieves the designer of
> the great care and attention needed to slice computations into
> (state-machine-driven) "tasks" that execute their computations little
> by little.
I don't think the CRC computation takes too long to split it into smaller and faster steps (state-machine). I think you could write that task without a state-machine in superloop architecture:

void task_CRC(void)
{
    uint8_t *packet;
    size_t packet_len;

    int err = queue_pop(&packet, &packet_len);
    if (!err) {
        packet[packet_len] = crc(packet, packet_len);
        packet_len++;
        queue_push(packet, packet_len);
    }
}

And here you don't have to think about multithreading issues (race conditions, sharing resources, ...).
>> Normally the biggest part of the firmware is related to the >> application/tasks, not interrupts. > > Except in the aforementioned foreground-background designs.
Even in foreground-background designs, ISRs are limited to critical real-time events.
Reply by Niklas Holsti January 15, 2020
On 2020-01-15 23:12, Hans-Bernhard Bröker wrote:
> Am 15.01.2020 um 20:26 schrieb Niklas Holsti: > >> Having a single stack rather than per-task stacks is easier if one has >> to program the high-water-mark method for measuring stack usage >> separately and manually for each stack. > > No. It's exactly as easy. You just do the same thing for each stack as > you would for a single one.
In terms of intrinsic difficulty, yes. In terms of the amount of manual work, no.
>> Static analysis of stack usage bounds, either by a dedicated tool from >> the machine code or as an extra function of the compiler and linker, is >> really so simple > > Objection. It is simple only in a limited class of use cases. As soon > as there is any use of function pointers or the merest hint of possible > recursion, static stack analysis becomes equivalent to Turing's Halting > Problem. Problems don't really get much harder than that one.
In theory, yes. In practice, no. The Halting Problem is impossible to solve for *all* programs. It doesn't mean that solutions are impossible for a *large* class of *common* programs. Most programs are designed cleanly enough to make stack-usage analysis possible. An exact value is not needed; an upper bound is enough, if it isn't very pessimistic. Assuming that the type system is not broken, a whole-program analysis can figure out all possible target functions of any function pointer. For recursion, I agree that some manual support (provide a bound on the depth of recursion) is usually needed. However, embedded programs rarely use recursion in any non-trivial way. And most functions have stack-frames of static size; it is rare to allocate dynamic amounts of stack. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ .
Reply by Hans-Bernhard Bröker January 15, 2020
Am 15.01.2020 um 20:26 schrieb Niklas Holsti:

> Having a single stack rather than per-task stacks is easier if one has > to program the high-water-mark method for measuring stack usage > separately and manually for each stack.
No. It's exactly as easy. You just do the same thing for each stack as you would for a single one.
> Static analysis of stack usage bounds, either by a dedicated tool from > the machine code or as an extra function of the compiler and linker, is > really so simple
Objection. It is simple only in a limited class of use cases. As soon as there is any use of function pointers or the merest hint of possible recursion, static stack analysis becomes equivalent to Turing's Halting Problem. Problems don't really get much harder than that one.
Reply by Hans-Bernhard Bröker January 15, 2020
Am 15.01.2020 um 09:36 schrieb pozz:
> Il 14/01/2020 20:10, Hans-Bernhard Bröker ha scritto:
>> Am 14.01.2020 um 13:27 schrieb pozz:
>>> Another trouble I found with a preemptive RTOS is how to size the
>>> stack of tasks.
>>
>> That trouble is not at all particular to preemptive scheduling.
>>
>>> In my superloop cooperative scheduler:
>>
>>> the stack is assigned to all the unused memory available in the system
>>> (that is, all the memory minus variables and heap size, if you use it).
>>
>> And how do you know that that's sufficient?
>
> I'm not smart enough to estimate the stack usage of a function (I know
> the compiler can produce some useful info about this, but you should add
> interrupt stack usage and so on), so my approach is only tests.
As the saying goes: testing proves diddly-squat. You'll hardly ever know whether your test cases came anywhere near exercising the code path of worst stack usage.
>> The difficulty of (reliably) computing stack usage is the same, >> regardless what tasking concept is used. > > I don't agree.
You disagree with a different statement than the one I made. I was talking about this job being difficult. You object based on it being a lot of work.
> Most of the time I can guess what is the task (or a few tasks) that > consumes more stack.
Guessing is about the only approach to this that is even less reliable than testing.
> In a preemptive scheduler, you would need to multiply your effort to
> estimate stack usage for every single task, to avoid wasting precious
> memory.
The effort doesn't really multiply, because it'll all be handled by loops in the code anyway. Doing the same thing multiple times is what CPUs are supposed to be good at, after all.
Reply by Niklas Holsti January 15, 2020
On 2020-01-14 14:27, pozz wrote:
> Another trouble I found with a preemptive RTOS is how to size the stack
> of tasks.
>
> In my superloop cooperative scheduler:
>
>   while(1) {
>     task1();  // Non-blocking fast state-machined task
>     task2();  // Non-blocking fast state-machined task
>   }
>
> the stack is assigned to all the unused memory available in the system
> (that is, all the memory minus variables and heap size, if you use it).
>
> If two tasks use a stack-intensive function (such as a printf-like
> function), this doesn't increase the overall stack requirement.
> For example, if the stack-intensive function needs 2kB of stack, the
> global stack can be 2kB (plus other stack needed by the tasks for other
> operations).
>
> With a preemptive scheduler, tasks can be interrupted at any point, even
> during a printf-like function. So *each* task needs a stack of 2kB,
> reaching a global stack requirement of 4kB.
That is quite true, if you allow pre-emption in the printf-like function. Also, depending on how your interrupt handlers are designed, you may have to allocate interrupt-handler space on each of the per-task stacks. However, any computation that needs a lot of stack also tends to use a lot of time, so you might have to slice it up into a sequence of states (state machine approach) and then the data must be statically allocated rather than stack-allocated, as Stef pointed out, and moreover each task that uses that computation must have its own copy of the data.
> Another issue with preemptive approach is that you should be smart > enough to size N stacks (where N is the number of tasks). > With the superloop architecture above, you should size only *one* global > stack, that can be calculated over one task, the one that needs more stack.
Having a single stack rather than per-task stacks is easier if one has to program the high-water-mark method for measuring stack usage separately and manually for each stack.

Static analysis of stack-usage bounds, either by a dedicated tool working from the machine code or as an extra function of the compiler and linker, is really so simple that I see little excuse for embedded-SW programming environments not to offer this analysis as a matter of course. And then per-task stacks are just as easy as one stack.
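The manual high-water-mark method referred to here is usually a fill-pattern scan, roughly as in this sketch (the linker symbols, the pattern value, and painting at startup are all assumptions; a descending stack is assumed too):

    #include <stdint.h>
    #include <stddef.h>

    /* Stack area bounds, assumed to come from the linker script. */
    extern uint32_t __stack_start__[];  /* lowest address of the stack area */
    extern uint32_t __stack_end__[];    /* one past the highest address     */

    #define FILL_PATTERN 0xDEADBEEFu

    /* At startup, paint the unused part of the stack area with a known
       pattern; real code must stop safely below the live stack frames,
       which is glossed over here via the caller-supplied limit. */
    void stack_paint(uint32_t *top_of_unused)
    {
        for (uint32_t *p = __stack_start__; p < top_of_unused; p++)
            *p = FILL_PATTERN;
    }

    /* High-water mark: with a descending stack, scan upward from the
       bottom for the first overwritten word; everything above it has
       been used at some time since painting. */
    size_t stack_max_usage_bytes(void)
    {
        uint32_t *p = __stack_start__;
        while (p < __stack_end__ && *p == FILL_PATTERN)
            p++;
        return (size_t)(__stack_end__ - p) * sizeof *p;
    }

With per-task stacks, this painting and scanning must be repeated for each task's stack area, which is the extra manual work referred to above.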
> Does this make sense?
Yes, the main-loop approach can use less stack than the pre-emptive approach. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ .
Reply by Niklas Holsti January 15, 2020
On 2020-01-10 1:32, pozz wrote:
> Il 09/01/2020 13:03, Niklas Holsti ha scritto:
>> On 2020-01-09 13:19, pozz wrote:
>>> Il 08/01/2020 23:54, Niklas Holsti ha scritto:
>>>> On 2020-01-08 1:02, pozz wrote:
>>>>> Il 07/01/2020 08:38, Niklas Holsti ha scritto:
>>
>>     [snip]
>>
>>>>>> Let's assume these requirements and properties of the environment:
>>>>>>
>>>>>> A. The function "serial_rx" polls the one-character reception
>>>>>> buffer of the serial line once, and returns the received
>>>>>> character, if any, and EOF otherwise. It must be called at least
>>>>>> as often as characters arrive (that is, depending on baud rate) to
>>>>>> avoid overrun and loss of some characters.
>>>>
>>>> You asked about possible advantages of pre-emption; I made my
>>>> assumptions, above, such that the (incomplete) example you gave
>>>> shows this advantage, under these assumptions (which could be true
>>>> for other, otherwise similar example applications).
>>>>
>>>>> No, the serial driver works in interrupt mode and already uses a
>>>>> FIFO buffer, sufficiently big. serial_rx() pops a single element
>>>>> from the FIFO, if any.
>>>>
>>>> Ah, then your *system* is intrinsically pre-emptive (the interrupts
>>>> pre-empt the tasks), even if the *code you showed* does not show
>>>> this pre-emption.
>>>
>>> Ah yes, interrupts are preemptive and I use them a lot, but they are
>>> confined in their work, they are very lightweight and fast.
>>
>> That's a design decision. Some systems do most of their work in
>> interrupt handlers, and use "background" processing only for some
>> non-critical housekeeping tasks.
>
> Yes, I imagine there is a plethora of possibilities. Anyway I thought
> there were two typical approaches: a superloop that continuously calls
> non-blocking functions (an example of a cooperative scheduler) and a full
> preemptive scheduler (most of the time a full RTOS).
The approach where all real-time "tasks" are interrupts is called "foreground-background", I believe. It was popular in the microcomputer era, before real-time kernels became popular -- it gave most of the same benefits (pre-emption) but scheduled by HW (interrupt priorities).
> Interrupts are always present, even in superloop.
Again a design choice. Some critical systems don't want *any* pre-emption, even by interrupts -- simplicity reigns. So they poll, and accept the performance hit. For example, some colleagues of mine recently completed the "recovery SW" for ESA's ExoMars rover, which takes over if the "real" SW crashes and is intended to be super-safe and just support problem analysis and recovery. Using any kind of real-time kernel was forbidden -- only sequential coding was allowed. Any new, nice SW "feature" that my colleagues suggested, for example to speed up communication with Earth, was denied by the customer because it would complicate the SW a little bit -- perhaps one more conditional statement. No-no.
> Maybe I am wrong, but I implement ISRs with great care and attention,
Of course you are right to use great care and attention, and especially in ISRs.
> but they are very small and limited.
Many tasks in pre-emptive systems are also small and limited -- for example, one of my designs has a task whose main loop just takes a data-packet from an input queue, computes its CRC and appends it to the packet, and puts the updated packet in an output queue. A handful of source lines.

I would say that having a pre-emptive system relieves the designer of the great care and attention needed to slice computations into (state-machine-driven) "tasks" that execute their computations little by little.
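For concreteness, such a task really is only a few lines under a pre-emptive kernel. The sketch below uses FreeRTOS-style queue calls as a stand-in; the actual kernel, packet type and hw_crc() of that design are not shown in the thread, so all names here are illustrative.

    #include <stdint.h>
    #include <stddef.h>
    #include "FreeRTOS.h"
    #include "queue.h"

    typedef struct {
        uint8_t data[64];   /* payload plus room for the CRC byte */
        size_t  len;        /* payload length, < sizeof data      */
    } packet_t;

    extern QueueHandle_t in_q, out_q;                   /* created elsewhere */
    extern uint8_t hw_crc(const uint8_t *p, size_t n);  /* HW CRC unit (illustrative) */

    /* The whole task: block on the input queue, append the CRC, forward.
       The queues are task-safe, so no extra locking is needed here. */
    void crc_task(void *arg)
    {
        (void)arg;
        packet_t pkt;
        for (;;) {
            xQueueReceive(in_q, &pkt, portMAX_DELAY);  /* blocks; other tasks run */
            pkt.data[pkt.len] = hw_crc(pkt.data, pkt.len);
            pkt.len++;
            xQueueSend(out_q, &pkt, portMAX_DELAY);
        }
    }

While this task blocks in xQueueReceive(), the kernel schedules other tasks; the superloop version instead has to be called repeatedly and return quickly, which is the trade-off the thread is debating.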
> Normally the biggest part of the firmware is related to the > application/tasks, not interrupts.
Except in the aforementioned foreground-background designs. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ .