C++ threads versus PThreads for embedded Linux on ARM micro

Started by gp.k...@gmail.com July 20, 2018
On 02/08/18 22:12, David Brown wrote:
> On 02/08/18 19:14, Tom Gardner wrote: >> On 02/08/18 17:56, upsidedown@downunder.com wrote: >>> All you need is the ability to have stacks in RAM and in addition >>> instructions for loading and storing the stack pointer from/to memory. >> >> Unfortunately that is difficult to implement in C, so youngsters >> don't think of it. >> > > This particular youngster grew up on small microcontrollers programmed in > assembly.
Ditto; neither of us are spring chickens. The first computer I designed, while still an undergrad, had 128 bytes of memory. I still have that ic :)
> "All you need is the RAM" is not helpful when you have 512 bytes in > total, and the stack pointer is limited to accessing the first 128 bytes of > that.  I have worked on microcontrollers where the context switches involved in > handling an interrupt could easily take 50 µs or more - a proper RTOS would be > far too high overhead. > > Of course you can have more minimal OS's with very limited features (all the way > down to "protothreads"), and perhaps cooperative multi-tasking rather than > pre-emptive.  But then you are not going to have threads with multiple message > queues like the ones under discussion.
Protothreads always strike me as being a kludge that is "justified" by the limitations of the software tools rather than the hardware limitations.
On 02/08/18 22:18, David Brown wrote:
> On 02/08/18 20:21, Tom Gardner wrote: >> On 02/08/18 19:15, George Neuner wrote: >>> One serious problem is that too many programmers are much better at >>> figuring out what CAN be done in parallel than they are at figuring >>> out what SHOULD be done in parallel. >>> >>> Having too many threads generally is worse than having too few. >> >> :) >> >> For "embarrassingly parallel" applications such as telecom >> systems, 1-4 "worker" threads per core is a good starting >> point. > > Ah, that's a different matter.  Here you are talking about threads that do a > job, send out a result, and then close down.  (Usually for efficiency you have a > thread pool, and it is a "job" object that is activated, run, and closed down, > rather than the whole thread.  But logically, it is the same.)  It doesn't > matter which of these worker threads is running at any time, you simply want to > make efficient use of the cpu resources and have everything completed in the end.
That latter "job runs on a thread" is precisely the structure I've used, where a "job" is to process an event - and that processing can involve multiple machines made by companies/computers I don't know exist :) In realtime systems I've never had a case where a thread was spawned for a job, and then discarded. I'd be highly suspicious of any such architecture.
> In an RTOS we are usually talking about threads that need to be alive at the > same time, spend most of their time blocked somewhere, and which need to > communicate and be able to wake each other.  You typically only have one cpu > core, but you might have dozens of threads.
I dislike such architectures; it can be difficult to predict/monitor/log how computations are progressing - or more accurately not progressing. Logging FSM events and FSM states is a very powerful tool, and the mathematicians have spent a lot of time/effort in understanding and modelling their behaviour.
On Thu, 2 Aug 2018 23:12:02 +0200, David Brown
<david.brown@hesbynett.no> wrote:

>On 02/08/18 19:14, Tom Gardner wrote: >> On 02/08/18 17:56, upsidedown@downunder.com wrote: >>> All you need is the ability to have stacks in RAM and in addition >>> instructions for loading and storing the stack pointer from/to memory. >> >> Unfortunately that is difficult to implement in C, so youngsters >> don't think of it. >> > >This particular youngster grew up on small microcontrollers programmed >in assembly. "All you need is the RAM" is not helpful when you have 512 >bytes in total, and the stack pointer is limited to accessing the first >128 bytes of that.
With 2-4 tasks, that is at least 32 bytes of stack per task. This needs to fit the task subroutine return addresses and space for saving task context (such as program counter, index register(s) and accumulator(s)) and additionally local variables used by the ISR. Should be doable, since subroutine parameters can be passed through the remaining 384 bytes.
>I have worked on microcontrollers where the context >switches involved in handling an interrupt could easily take 50 µs or >more - a proper RTOS would be far too high overhead. > >Of course you can have more minimal OS's with very limited features (all >the way down to "protothreads"), and perhaps cooperative multi-tasking >rather than pre-emptive. But then you are not going to have threads >with multiple message queues like the ones under discussion.
A simple RTOS just needs a fixed table of tasks in priority order, in which each element contains the task state and saved stack pointer. When e.g. the ISR wants to activate a specific task, it simply sets the target task state to READY. After that, scan the task table, find the first task in READY state, load its saved stack pointer and execute a return from interrupt from the new stack to restore that task's context. Routines for e.g. sending and receiving messages between tasks are just syntactic sugar :-).
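For illustration only, the table scan described above might look roughly like this in C. All names here (`task_table`, `task_make_ready`, `scheduler_pick`, `NUM_TASKS`) are made up for the sketch, and the part that actually loads the saved stack pointer and returns from interrupt is target-specific assembly, so it is left out.

```c
#include <stddef.h>

/* Hypothetical task states; TASK_BLOCKED is 0 so a zeroed table starts blocked. */
enum task_state { TASK_BLOCKED, TASK_READY };

struct task {
    enum task_state state;
    void *saved_sp;            /* stack pointer saved when the task was switched out */
};

#define NUM_TASKS 4            /* assumed task count */

/* Fixed table; index 0 is the highest priority. */
static struct task task_table[NUM_TASKS];

/* What an ISR would do to activate a task: just mark it READY. */
void task_make_ready(size_t id)
{
    if (id < NUM_TASKS)
        task_table[id].state = TASK_READY;
}

/* Scan in priority order; return the first READY task, or -1 if none. */
int scheduler_pick(void)
{
    for (size_t i = 0; i < NUM_TASKS; i++) {
        if (task_table[i].state == TASK_READY)
            return (int)i;
    }
    return -1;
}
```

On a real target the interrupt exit path would call something like `scheduler_pick()`, load `task_table[n].saved_sp` into the stack pointer and execute the return-from-interrupt instruction.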
On 18-08-03 00:22 , StateMachineCOM wrote:
> @David Brown: As they say: "you can lead a horse to water, but you > can't make it drink". I rest my case.
At least this one juror decides in favour of David B. All programs are state machines, but the "state" can be represented in various ways: as data ("reified") or as control flow ("sequential", "blocking"). Which is better depends on the problem to be solved; usually I end up with a mixture. Sometimes system constraints force one to use more data-state than is optimal, and the code becomes an awful mess. I have in mind my last project but one, where the SW controls several devices over a MIL-STD-1553 bus, with a cyclic, frame-based schedule, running several sporadically activated concurrent activities, each of which usually requires several carefully timed bus commands and responses, spread over several bus cycles. The nicest design would dedicate one thread to each such activity; after sending a command, the thread would wait (block) for the response, and would know, from its position in the algorithm, what to do with the response and what command to send next. But system constraints prohibit this number of threads, and thread switches, so all the state of the activities is reified into multiple state machines, activated once on every bus cycle, with switch/case statements that change the global state, increment counters, detect end of loops, retry failed commands, and so on and on. Yuck. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ .
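A rough sketch of what one such reified activity can look like, for readers who have not had the pleasure. This is not the project's code; the states, `MAX_RETRIES` and the function names are invented for the example, and a real activity would have many more states and counters.

```c
#include <stdbool.h>

/* Hypothetical states of one bus activity, held as data rather than control flow. */
enum act_state { ACT_IDLE, ACT_SEND_CMD, ACT_WAIT_RESP, ACT_DONE, ACT_FAILED };

struct activity {
    enum act_state state;
    int retries_left;
};

#define MAX_RETRIES 2          /* assumed retry budget */

void activity_start(struct activity *a)
{
    a->state = ACT_SEND_CMD;
    a->retries_left = MAX_RETRIES;
}

/*
 * Called once per bus cycle. resp_ok says whether a valid response arrived
 * since the last call (only meaningful in ACT_WAIT_RESP).
 * Returns true if a command should be put on the bus in this cycle.
 */
bool activity_step(struct activity *a, bool resp_ok)
{
    switch (a->state) {
    case ACT_SEND_CMD:
        a->state = ACT_WAIT_RESP;
        return true;                       /* first transmission */
    case ACT_WAIT_RESP:
        if (resp_ok) {
            a->state = ACT_DONE;
            return false;
        }
        if (a->retries_left > 0) {
            a->retries_left--;
            return true;                   /* resend, stay in ACT_WAIT_RESP */
        }
        a->state = ACT_FAILED;
        return false;
    case ACT_IDLE:
    case ACT_DONE:
    case ACT_FAILED:
    default:
        return false;                      /* nothing to do this cycle */
    }
}
```

In the blocking-thread version the same logic would be a plain loop around "send, wait for response"; here the position in that loop has to be stored in `state` and `retries_left` between cycles.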
On 02/08/18 23:22, StateMachineCOM wrote:
> @David Brown: As they say: "you can lead a horse to water, but you can't make it drink". I rest my case. >
That applies equally in the other direction, of course. The discussion has been interesting, but I really don't think either you or the paper's author have done yourselves justice here. I have not seen any clear arguments from you about /why/ you think event-driven threads with state machines are so much better - "proof by repeated assertion" does not wash. I have not seen any counters to my alternative suggestions, nor any solid reasoning why strict event-driven threads with state machines are somehow easier than more flexible solutions. And I certainly have not seen any good argument for why using /one/ tool should be so much better than having that tool as an option amongst several. I am of the opinion that having more options lets you pick better designs. (I actually think that in a great many cases where people use RTOS's, they would be better with a simpler non-OS design. I am not a believer in "everything should be RTOS".) But despite my clear opinions here, I think I could have come up with more and better arguments in your favour than you did. Still, as I say, it has been an interesting discussion in many ways, and it is good to see this sort of thing in the newsgroup. It has been too quiet for too long.
On 03/08/18 00:52, Tom Gardner wrote:
> On 02/08/18 22:12, David Brown wrote: >> On 02/08/18 19:14, Tom Gardner wrote: >>> On 02/08/18 17:56, upsidedown@downunder.com wrote: >>>> All you need is the ability to have stacks in RAM and in addition >>>> instructions for loading and storing the stack pointer from/to memory. >>> >>> Unfortunately that is difficult to implement in C, so youngsters >>> don't think of it. >>> >> >> This particular youngster grew up on small microcontrollers programmed >> in assembly. > > Ditto; neither of us are spring chickens. > > The first computer I designed, while still an undergrad, had > 128 bytes of memory. I still have that ic :)
I have never designed a chip - though in my teens I once designed key parts of a simple 4-bit cpu with a small instruction set. I drew it all out in two input NAND gates, on graph paper - including the single-cycle multiplier.
> >> "All you need is the RAM" is not helpful when you have 512 bytes in >> total, and the stack pointer is limited to accessing the first 128 >> bytes of that. I have worked on microcontrollers where the context >> switches involved in handling an interrupt could easily take 50 µs or >> more - a proper RTOS would be far too high overhead. >> >> Of course you can have more minimal OS's with very limited features >> (all the way down to "protothreads"), and perhaps cooperative >> multi-tasking rather than pre-emptive. But then you are not going to >> have threads with multiple message queues like the ones under discussion. > > Protothreads always strike me as being a kludge that is "justified" > by the limitations of the software tools rather than the hardware > limitations.
They are a way to get convenient coding structures from very little software or hardware. I haven't had a use for them myself, but I suppose some people use them.
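For anyone who has not met the technique, here is a minimal sketch of the switch-on-line-number trick that protothread-style code relies on. The macros below are written for this example and only imitate the style; they are not the API of any particular protothreads library, and `data_ready`, `processed` and `consumer` are invented names.

```c
/* Hypothetical protothread-style macros: the "thread" state is one int. */
struct pt { int line; };

#define PT_BEGIN(pt)  switch ((pt)->line) { case 0:
#define PT_WAIT_UNTIL(pt, cond) \
    do { (pt)->line = __LINE__; case __LINE__: if (!(cond)) return 0; } while (0)
#define PT_END(pt)    } (pt)->line = 0; return 1

static int data_ready;   /* set by some other part of the program */
static int processed;    /* counts items handled so far */

/* Returns 0 while still waiting, 1 when it has run to completion. */
int consumer(struct pt *pt)
{
    PT_BEGIN(pt);

    PT_WAIT_UNTIL(pt, data_ready);   /* "blocks" by returning to the caller */
    processed++;
    data_ready = 0;

    PT_WAIT_UNTIL(pt, data_ready);   /* resumes here on a later call */
    processed++;

    PT_END(pt);
}
```

The main loop simply calls `consumer(&pt)` repeatedly. No stack is kept per thread, so ordinary local variables do not survive a wait, and a wait can only appear directly in the protothread function because it is implemented as a plain `return`.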
On 03/08/18 09:28, David Brown wrote:
> On 03/08/18 00:52, Tom Gardner wrote: >> On 02/08/18 22:12, David Brown wrote: >>> On 02/08/18 19:14, Tom Gardner wrote: >>>> On 02/08/18 17:56, upsidedown@downunder.com wrote: >>>>> All you need is the ability to have stacks in RAM and in addition >>>>> instructions for loading and storing the stack pointer from/to memory. >>>> >>>> Unfortunately that is difficult to implement in C, so youngsters >>>> don't think of it. >>>> >>> >>> This particular youngster grew up on small microcontrollers programmed >>> in assembly. >> >> Ditto; neither of us are spring chickens. >> >> The first computer I designed, while still an undergrad, had >> 128 bytes of memory. I still have that ic :) > > I have never designed a chip - though in my teens I once designed key > parts of a simple 4-bit cpu with a small instruction set. I drew it all > out in two input NAND gates, on graph paper - including the single-cycle > multiplier.
Mine was 6800 based. Just about everything was "suboptimal" - except that it worked and I learned a heck of a lot. I later designed a single-purpose machine using 2900 bit-slices, but it was never implemented.
> >> >>> "All you need is the RAM" is not helpful when you have 512 bytes in >>> total, and the stack pointer is limited to accessing the first 128 >>> bytes of that. I have worked on microcontrollers where the context >>> switches involved in handling an interrupt could easily take 50 µs or >>> more - a proper RTOS would be far too high overhead. >>> >>> Of course you can have more minimal OS's with very limited features >>> (all the way down to "protothreads"), and perhaps cooperative >>> multi-tasking rather than pre-emptive. But then you are not going to >>> have threads with multiple message queues like the ones under discussion. >> >> Protothreads always strike me as being a kludge that is "justified" >> by the limitations of the software tools rather than the hardware >> limitations. > > They are a way to get convenient coding structures from very little > software or hardware. I haven't had a use for them myself, but I > suppose some people use them.
That's my (lack of) experience, but the requirement that all context switches are in the top level code (i.e. not a function) doesn't strike me as convenient. I know why it is "necessary", but that doesn't change the inconvenience. Back in 1982 I was using cooperative multitasking with C threads on a Z80 (or PDP11 or whatever was convenient), with a little bit of assembler to save/restore the stacks and other context. That always seemed pretty natural - provided I used message passing (with timeouts) for large-scale flow control.
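As a hedged illustration of the "one stack per task, save and restore the context" style, here is a small host-side sketch. It swaps in the POSIX `ucontext` calls (`getcontext`, `makecontext`, `swapcontext`, as found in glibc) for the hand-written assembler, so it is not the 1982 code and would not run on a Z80; the task names, stack size and `trace` buffer are invented, and message passing with timeouts is not shown.

```c
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)      /* assumed per-task stack size */

static ucontext_t main_ctx, task_ctx[2];
static char stacks[2][STACK_SIZE];  /* one private stack per task */
static char trace[16];              /* records the order in which tasks ran */
static int trace_len;

/* Cooperative yield: save this task's context, resume the main loop. */
static void yield_to_main(int id)
{
    swapcontext(&task_ctx[id], &main_ctx);
}

static void task_a(void)
{
    for (int i = 0; i < 2; i++) {
        trace[trace_len++] = 'A';
        yield_to_main(0);           /* yielding from inside a called function is fine */
    }
}

static void task_b(void)
{
    for (int i = 0; i < 2; i++) {
        trace[trace_len++] = 'B';
        yield_to_main(1);
    }
}

void run_tasks(void)
{
    void (*entry[2])(void) = { task_a, task_b };

    for (int i = 0; i < 2; i++) {
        getcontext(&task_ctx[i]);
        task_ctx[i].uc_stack.ss_sp = stacks[i];
        task_ctx[i].uc_stack.ss_size = STACK_SIZE;
        task_ctx[i].uc_link = &main_ctx;   /* where to go when the task returns */
        makecontext(&task_ctx[i], entry[i], 0);
    }

    /* Round-robin: two passes do the work, the third lets each task finish. */
    for (int round = 0; round < 3; round++)
        for (int i = 0; i < 2; i++)
            swapcontext(&main_ctx, &task_ctx[i]);

    trace[trace_len] = '\0';
}
```

Because each task has its own stack, the yield can sit in any function the task calls, which is exactly what the top-level-only restriction of protothreads rules out.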
On 03/08/18 12:21, Tom Gardner wrote:
> On 03/08/18 09:28, David Brown wrote: >> On 03/08/18 00:52, Tom Gardner wrote: >>> On 02/08/18 22:12, David Brown wrote: >>>> On 02/08/18 19:14, Tom Gardner wrote: >>>>> On 02/08/18 17:56, upsidedown@downunder.com wrote: >>>>>> All you need is the ability to have stacks in RAM and in addition >>>>>> instructions for loading and storing the stack pointer from/to >>>>>> memory. >>>>> >>>>> Unfortunately that is difficult to implement in C, so youngsters >>>>> don't think of it. >>>>> >>>> >>>> This particular youngster grew up on small microcontrollers programmed >>>> in assembly. >>> >>> Ditto; neither of us are spring chickens. >>> >>> The first computer I designed, while still an undergrad, had >>> 128 bytes of memory. I still have that ic :) >> >> I have never designed a chip - though in my teens I once designed key >> parts of a simple 4-bit cpu with a small instruction set. I drew it all >> out in two input NAND gates, on graph paper - including the single-cycle >> multiplier. > > Mine was 6800 based. Just about everything was "suboptimal" - except > that it worked and I learned a heck of a lot. > > I later designed a single-purpose machine using 2900 bit-slices, > but it was never implemented. > >> >>> >>>> "All you need is the RAM" is not helpful when you have 512 bytes in >>>> total, and the stack pointer is limited to accessing the first 128 >>>> bytes of that. I have worked on microcontrollers where the context >>>> switches involved in handling an interrupt could easily take 50 µs or >>>> more - a proper RTOS would be far too high overhead. >>>> >>>> Of course you can have more minimal OS's with very limited features >>>> (all the way down to "protothreads"), and perhaps cooperative >>>> multi-tasking rather than pre-emptive. But then you are not going to >>>> have threads with multiple message queues like the ones under >>>> discussion.
>>> >>> Protothreads always strike me as being a kludge that is "justified" >>> by the limitations of the software tools rather than the hardware >>> limitations. >> >> They are a way to get convenient coding structures from very little >> software or hardware. I haven't had a use for them myself, but I >> suppose some people use them. > > That's my (lack of) experience, but the requirement that all > context switches are in the top level code (i.e. not a function) > doesn't strike me as convenient. I know why it is "necessary", > but that doesn't change the inconvenience.
I agree entirely - that is one of the main things that puts me off about them. But others may find the benefits are worth it.
> > Back in 1982 I was using cooperative multitasking with C threads > on a Z80 (or PDP11 or whatever was convenient), with a little > bit of assembler to save/restore the stacks and other context. > That always seemed pretty natural - provided I used message > passing (with timeouts) for large-scale flow control.
On Thu, 2 Aug 2018 18:14:55 +0100, Tom Gardner
<spamjunk@blueyonder.co.uk> wrote:

>On 02/08/18 17:56, upsidedown@downunder.com wrote: >> All you need is the ability to have stacks in RAM and in addition >> instructions for loading and storing the stack pointer from/to memory. > >Unfortunately that is difficult to implement in C, so youngsters >don't think of it.
It can get a little messy, certainly, but really, task switching in C is not that hard provided the jmp_buf structure is documented. George
On Thu, 2 Aug 2018 23:13:27 +0200, David Brown
<david.brown@hesbynett.no> wrote:

>On 02/08/18 19:23, Robert Wessel wrote: >> On Thu, 02 Aug 2018 16:14:47 +0300, upsidedown@downunder.com wrote: >> >>> On Thu, 02 Aug 2018 12:24:49 +0200, David Brown >>> <david.brown@hesbynett.no> wrote: >>> >>> Windows NT 3.5 and later on was a full blown priority based >>> multitasking system with close resemblance to RSX-11 and VMS. >> >> Windows NT 3.1, actually. >> > >Yes, of course. Win NT 3.1 was rarely seen in the wild, but it did exist.
And the reason it bore a resemblance to RSX-11 and VMS was because many of the people who wrote it were hired away from DEC. George