C++ threads versus PThreads for embedded Linux on ARM micro

Reply by Tom Gardner ● August 2, 2018

On 02/08/18 22:12, David Brown wrote:
> On 02/08/18 19:14, Tom Gardner wrote:
>> On 02/08/18 17:56, upsidedown@downunder.com wrote:
>>> All you need is the ability to have stacks in RAM and in addition
>>> instructions for loading and storing the stack pointer from/to memory.
>>
>> Unfortunately that is difficult to implement in C, so youngsters
>> don't think of it.
>>
> 
> This particular youngster grew up on small microcontrollers programmed in 
> assembly.  

Ditto; neither of us are spring chickens.

The first computer I designed, while still an undergrad, had
128 bytes of memory. I still have that ic :)

> "All you need is the RAM" is not helpful when you have 512 bytes in 
> total, and the stack pointer is limited to accessing the first 128 bytes of 
> that.  I have worked on microcontrollers where the context switches involved in 
> handling an interrupt could easily take 50 µs or more - a proper RTOS would be 
> far too high overhead.
> 
> Of course you can have more minimal OS's with very limited features (all the way 
> down to "protothreads"), and perhaps cooperative multi-tasking rather than 
> pre-emptive.  But then you are not going to have threads with multiple message 
> queues like the ones under discussion.

Protothreads always strike me as being a kludge that is "justified"
by the limitations of the software tools rather than the hardware
limitations.

Reply by Tom Gardner ● August 2, 2018

On 02/08/18 22:18, David Brown wrote:
> On 02/08/18 20:21, Tom Gardner wrote:
>> On 02/08/18 19:15, George Neuner wrote:
>>> One serious problem is that too many programmers are much better at
>>> figuring out what CAN be done in parallel than they are at figuring
>>> out what SHOULD be done in parallel.
>>>
>>> Having too many threads generally is worse than having too few.
>>
>> :)
>>
>> For "embarrassingly parallel" applications such as telecom
>> systems, 1-4 "worker" threads per core is a good starting
>> point.
> 
> Ah, that's a different matter.  Here you are talking about threads that do a 
> job, send out a result, and then close down.  (Usually for efficiency you have a 
> thread pool, and it is a "job" object that is activated, run, and closed down, 
> rather than the whole thread.  But logically, it is the same.)  It doesn't 
> matter which of these worker threads is running at any time, you simply want to 
> make efficient use of the cpu resources and have everything completed in the end.

That latter "job runs on a thread" is precisely the structure
I've used, where a "job" is to process an event - and that
processing can involve multiple machines made by companies/computers
I don't know exist :)

In realtime systems I've never had a case where a thread was
spawned for a job, and then discarded. I'd be highly suspicious
of any such architecture.


> In an RTOS we are usually talking about threads that need to be alive at the 
> same time, spend most of their time blocked somewhere, and which need to 
> communicate and be able to wake each other.  You typically only have one cpu 
> core, but you might have dozens of threads. 

I dislike such architectures; it can be difficult to predict/monitor/log
how computations are progressing - or more accurately not progressing.
Logging FSM events and FSM states is a very powerful tool, and the
mathematicians have spent a lot of time/effort in understanding and
modelling their behaviour.
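
As a minimal sketch of what logging FSM events and states can look like: every dispatch records the (state, event, new state) triple in a small ring buffer, so a stalled computation shows up as a missing or repeated transition. The call-handling states, the event names and `fsm_dispatch` are hypothetical and only there for illustration.

```c
/* Hypothetical call-handling FSM; names are illustrative only. */
typedef enum { ST_IDLE, ST_CONNECTING, ST_ACTIVE } state_t;
typedef enum { EV_DIAL, EV_ANSWER, EV_HANGUP } event_t;

typedef struct {
    state_t from;
    event_t ev;
    state_t to;
} log_entry_t;

#define LOG_SIZE 8
static log_entry_t fsm_log[LOG_SIZE];  /* ring buffer of recent transitions */
static unsigned log_count;             /* total transitions ever logged */

/* Pure transition function: unhandled events leave the state unchanged. */
static state_t next_state(state_t s, event_t e)
{
    switch (s) {
    case ST_IDLE:
        return e == EV_DIAL ? ST_CONNECTING : ST_IDLE;
    case ST_CONNECTING:
        if (e == EV_ANSWER) return ST_ACTIVE;
        if (e == EV_HANGUP) return ST_IDLE;
        return ST_CONNECTING;
    case ST_ACTIVE:
        return e == EV_HANGUP ? ST_IDLE : ST_ACTIVE;
    }
    return s;
}

/* Dispatch one event and log the transition, overwriting the oldest entry. */
state_t fsm_dispatch(state_t s, event_t e)
{
    state_t to = next_state(s, e);
    fsm_log[log_count % LOG_SIZE] = (log_entry_t){ s, e, to };
    log_count++;
    return to;
}
```

Ignored events are logged too (with `from == to`), which is often the most useful entry when working out why nothing is progressing.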

Reply by ● August 3, 2018

On Thu, 2 Aug 2018 23:12:02 +0200, David Brown
<david.brown@hesbynett.no> wrote:

>On 02/08/18 19:14, Tom Gardner wrote:
>> On 02/08/18 17:56, upsidedown@downunder.com wrote:
>>> All you need is the ability to have stacks in RAM and in addition
>>> instructions for loading and storing the stack pointer from/to memory.
>> 
>> Unfortunately that is difficult to implement in C, so youngsters
>> don't think of it.
>> 
>
>This particular youngster grew up on small microcontrollers programmed 
>in assembly.  "All you need is the RAM" is not helpful when you have 512 
>bytes in total, and the stack pointer is limited to accessing the first 
>128 bytes of that.  

With 2-4 tasks, that is at least 32 bytes of stack per task. This
needs to fit the task subroutine return addresses and space for saving
task context (such as program counter, index register(s) and
accumulator(s)) and additionally local variables used by the ISR.
Should be doable, since subroutine parameters can be passed through
the remaining 384 bytes.
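
The budget can be checked at compile time. This is only a sketch: the 128-byte stack region and the four tasks come from the figures above, but the context size, call depth and ISR locals are assumed values for a generic 8-bit CPU, not measurements from any real part.

```c
#include <assert.h>  /* static_assert (C11) */

enum {
    STACK_REGION_BYTES = 128,  /* stack pointer can only reach this much */
    MAX_TASKS          = 4,
    STACK_PER_TASK     = STACK_REGION_BYTES / MAX_TASKS,  /* 32 bytes */

    /* Assumed context: 2-byte PC, 2-byte index register,
       1-byte accumulator, 1-byte flags. */
    CONTEXT_BYTES      = 2 + 2 + 1 + 1,
    RETURN_ADDR_BYTES  = 2,
    MAX_CALL_DEPTH     = 4,    /* assumed deepest subroutine nesting */
    ISR_LOCAL_BYTES    = 8,    /* assumed ISR local variables */

    WORST_CASE_BYTES   = CONTEXT_BYTES
                       + RETURN_ADDR_BYTES * MAX_CALL_DEPTH
                       + ISR_LOCAL_BYTES
};

/* The build fails if the assumed worst case no longer fits. */
static_assert(WORST_CASE_BYTES <= STACK_PER_TASK,
              "per-task stack budget exceeded");

/* Bytes left over per task under the assumptions above. */
int stack_headroom_bytes(void)
{
    return STACK_PER_TASK - WORST_CASE_BYTES;
}
```

With these assumed numbers the worst case is 22 bytes, leaving 10 bytes of headroom in each 32-byte stack.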

>I have worked on microcontrollers where the context 
>switches involved in handling an interrupt could easily take 50 µs or 
>more - a proper RTOS would be far too high overhead.
>
>Of course you can have more minimal OS's with very limited features (all 
>the way down to "protothreads"), and perhaps cooperative multi-tasking 
>rather than pre-emptive.  But then you are not going to have threads 
>with multiple message queues like the ones under discussion.

A simple RTOS just needs a fixed table for each task in priority order
in which each element contains the task state and saved stack pointer.

When e.g. the ISR wants to activate a specific task, it simply sets
the target task state to READY. After that, scan the task table and
find first task in READY state, load the saved stack  pointer and
execute a return from interrupt from the new stack to restore that
task context.
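
A minimal sketch of that table and scan in C, with hypothetical names (`task_table`, `isr_activate`, `scheduler_pick`). The actual stack pointer load and return from interrupt are CPU-specific assembler and are only indicated by a comment.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { TASK_BLOCKED, TASK_READY } task_state_t;

typedef struct {
    task_state_t state;
    uintptr_t    saved_sp;  /* stack pointer saved when the task was switched out */
} task_t;

#define NUM_TASKS 4

/* Fixed table in priority order: index 0 is the highest priority.
   Static storage starts zeroed, so every task begins TASK_BLOCKED. */
static task_t task_table[NUM_TASKS];

/* Called from an ISR to make a task runnable. */
void isr_activate(size_t id)
{
    if (id < NUM_TASKS)
        task_table[id].state = TASK_READY;
}

/* Called when a task has nothing more to do. */
void task_block(size_t id)
{
    if (id < NUM_TASKS)
        task_table[id].state = TASK_BLOCKED;
}

/* Scan for the first READY task; returns its index, or -1 if none.
   A real kernel would then load task_table[i].saved_sp into the stack
   pointer and execute a return from interrupt (assembler, not shown). */
int scheduler_pick(void)
{
    for (size_t i = 0; i < NUM_TASKS; i++)
        if (task_table[i].state == TASK_READY)
            return (int)i;
    return -1;
}
```

Because the table is in priority order, the linear scan is the whole scheduling policy: the first READY entry wins.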

Routines for e.g. sending and receiving messages between tasks are just
syntactic sugar :-).
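
To illustrate how thin that layer can be, here is a sketch of a one-slot mailbox per task built on nothing but a task state field. All names (`mbox_task_t`, `msg_send`, `msg_receive`) are hypothetical, and the interrupt masking a real kernel needs around these updates is left out.

```c
#include <stdbool.h>

typedef enum { MT_BLOCKED, MT_READY } mbox_state_t;

typedef struct {
    mbox_state_t state;    /* same role as the task state in a scheduler table */
    bool         has_msg;  /* true while the one-slot mailbox is full */
    int          msg;
} mbox_task_t;

#define MBOX_TASKS 3
static mbox_task_t mbox_tasks[MBOX_TASKS];  /* zeroed: blocked, empty */

/* Deposit a message and wake the receiver. Fails if the slot is full. */
bool msg_send(unsigned to, int msg)
{
    if (to >= MBOX_TASKS || mbox_tasks[to].has_msg)
        return false;
    mbox_tasks[to].msg     = msg;
    mbox_tasks[to].has_msg = true;
    mbox_tasks[to].state   = MT_READY;   /* the scheduler will now pick it */
    return true;
}

/* Take a pending message, or mark the caller blocked if there is none.
   A real kernel would switch to another task at that point. */
bool msg_receive(unsigned self, int *out)
{
    if (self >= MBOX_TASKS)
        return false;
    if (!mbox_tasks[self].has_msg) {
        mbox_tasks[self].state = MT_BLOCKED;
        return false;
    }
    *out = mbox_tasks[self].msg;
    mbox_tasks[self].has_msg = false;
    return true;
}
```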

Reply by Niklas Holsti ● August 3, 2018

On 18-08-03 00:22 , StateMachineCOM wrote:
> @David Brown: As they say: "you can lead a horse to water, but you
> can't make it drink". I rest my case.

At least this one juror decides in favour of David B.

All programs are state machines, but the "state" can be represented in 
various ways: as data ("reified") or as control flow ("sequential", 
"blocking"). Which is better depends on the problem to be solved; 
usually I end up with a mixture.
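
A small sketch of the two representations, using a made-up task: count how often an 'a' (or a run of them) is immediately followed by a 'b'. In the first function the state is the position in the code; in the second it is an explicit variable advanced one input at a time. Reading from a string stands in for a blocking read, and all names are hypothetical.

```c
/* State as control flow: where we are in the code says what we have seen. */
unsigned count_ab_sequential(const char *in)
{
    unsigned n = 0;
    while (*in) {
        if (*in++ != 'a')
            continue;            /* here: nothing useful seen yet */
        while (*in == 'a')
            in++;                /* here: we have seen at least one 'a' */
        if (*in == 'b') {
            n++;
            in++;
        }
    }
    return n;
}

/* State as data: the same knowledge held in an explicit variable. */
typedef enum { SEEN_NOTHING, SEEN_A } ab_state_t;

typedef struct {
    ab_state_t state;
    unsigned   matches;
} ab_fsm_t;

/* Advance the machine by exactly one input character. */
void ab_step(ab_fsm_t *m, char c)
{
    switch (m->state) {
    case SEEN_NOTHING:
        if (c == 'a')
            m->state = SEEN_A;
        break;
    case SEEN_A:
        if (c == 'b') {
            m->matches++;
            m->state = SEEN_NOTHING;
        } else if (c != 'a') {
            m->state = SEEN_NOTHING;
        }
        break;
    }
}

unsigned count_ab_reified(const char *in)
{
    ab_fsm_t m = { SEEN_NOTHING, 0 };
    for (; *in; in++)
        ab_step(&m, *in);
    return m.matches;
}
```

The sequential form needs its own thread of control if the input arrives over time; the reified form can be called from any event loop, at the cost of spelling out every state by hand.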

Sometimes system constraints force one to use more data-state than is 
optimal, and the code becomes an awful mess. I have in mind my last 
project but one, where the SW controls several devices over a 
MIL-STD-1553 bus, with a cyclic, frame-based schedule, running several 
sporadically activated concurrent activities, each of which usually 
requires several carefully timed bus commands and responses, spread over 
several bus cycles.

The nicest design would dedicate one thread to each such activity; after 
sending a command, the thread would wait (block) for the response, and 
would know, from its position in the algorithm, what to do with the 
response and what command to send next. But system constraints prohibit 
this number of threads, and thread switches, so all the state of the 
activities is reified into multiple state machines, activated once on 
every bus cycle, with switch/case statements that change the global 
state, increment counters, detect end of loops, retry failed commands, 
and so on and on. Yuck.
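
A much-reduced sketch of one such reified activity, to show the shape of the code rather than the real thing: the activity is stepped once per bus cycle and has to remember in data which command is in flight and how many retries it has used. The names, the three-command sequence and the retry limit are all hypothetical; nothing here is a MIL-STD-1553 API.

```c
#include <stdbool.h>

typedef enum { ACT_START, ACT_AWAIT_RESP, ACT_DONE, ACT_FAILED } act_state_t;

typedef struct {
    act_state_t state;
    unsigned    cmd_index;  /* which command of the sequence is in flight */
    unsigned    retries;    /* retries used so far for that command */
} activity_t;

enum { NUM_CMDS = 3, MAX_RETRIES = 2 };  /* assumed values */

/* Called once per bus cycle. resp_ok says whether the response to the
   command sent in the previous cycle was good; it is ignored in ACT_START.
   Returns the index of the command to send this cycle, or -1 for none. */
int activity_step(activity_t *a, bool resp_ok)
{
    switch (a->state) {
    case ACT_START:
        a->state = ACT_AWAIT_RESP;
        return (int)a->cmd_index;

    case ACT_AWAIT_RESP:
        if (resp_ok) {
            a->retries = 0;
            a->cmd_index++;                 /* move on to the next command */
            if (a->cmd_index == NUM_CMDS) { /* end of the sequence */
                a->state = ACT_DONE;
                return -1;
            }
        } else {
            a->retries++;                   /* resend the same command */
            if (a->retries > MAX_RETRIES) {
                a->state = ACT_FAILED;
                return -1;
            }
        }
        return (int)a->cmd_index;

    case ACT_DONE:
    case ACT_FAILED:
        return -1;
    }
    return -1;
}
```

In the thread-per-activity design the command index and retry count would simply be a loop variable and a local counter; here they have to live in the struct because the function returns after every cycle.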

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .

Reply by David Brown ● August 3, 2018

On 02/08/18 23:22, StateMachineCOM wrote:
> @David Brown: As they say: "you can lead a horse to water, but you can't make it drink". I rest my case.
> 

That applies equally in the other direction, of course.

The discussion has been interesting, but I really don't think either you
or the paper's author have done yourselves justice here.

I have not seen any clear arguments from you about /why/ you think
event-driven threads with state machines are so much better - "proof by
repeated assertion" does not wash.  I have not seen any counters to my
alternative suggestions, nor any solid reasoning why strict event-driven
threads with state machines are somehow easier than more flexible
solutions.  And I certainly have not seen any good argument for why
using /one/ tool should be so much better than having that tool as an
option amongst several.

I am of the opinion that having more options lets you pick better
designs.  (I actually think that in a great many cases where people use
RTOS's, they would be better with a simpler non-OS design.  I am not a
believer in "everything should be RTOS".)  But despite my clear opinions
here, I think I could have come up with more and better arguments in
your favour than you did.

Still, as I say, it has been an interesting discussion in many ways, and
it is good to see this sort of thing in the newsgroup.  It has been too
quiet for too long.

Reply by David Brown ● August 3, 2018

On 03/08/18 00:52, Tom Gardner wrote:
> On 02/08/18 22:12, David Brown wrote:
>> On 02/08/18 19:14, Tom Gardner wrote:
>>> On 02/08/18 17:56, upsidedown@downunder.com wrote:
>>>> All you need is the ability to have stacks in RAM and in addition
>>>> instructions for loading and storing the stack pointer from/to memory.
>>>
>>> Unfortunately that is difficult to implement in C, so youngsters
>>> don't think of it.
>>>
>>
>> This particular youngster grew up on small microcontrollers programmed
>> in assembly.  
> 
> Ditto; neither of us are spring chickens.
> 
> The first computer I designed, while still an undergrad, had
> 128 bytes of memory. I still have that ic :)

I have never designed a chip - though in my teens I once designed key
parts of a simple 4-bit cpu with a small instruction set.  I drew it all
out in two input NAND gates, on graph paper - including the single-cycle
multiplier.

> 
>> "All you need is the RAM" is not helpful when you have 512 bytes in
>> total, and the stack pointer is limited to accessing the first 128
>> bytes of that.  I have worked on microcontrollers where the context
>> switches involved in handling an interrupt could easily take 50 µs or
>> more - a proper RTOS would be far too high overhead.
>>
>> Of course you can have more minimal OS's with very limited features
>> (all the way down to "protothreads"), and perhaps cooperative
>> multi-tasking rather than pre-emptive.  But then you are not going to
>> have threads with multiple message queues like the ones under discussion.
> 
> Protothreads always strike me as being a kludge that is "justified"
> by the limitations of the software tools rather than the hardware
> limitations.

They are a way to get convenient coding structures from very little
software or hardware.  I haven't had a use for them myself, but I
suppose some people use them.

Reply by Tom Gardner ● August 3, 2018

On 03/08/18 09:28, David Brown wrote:
> On 03/08/18 00:52, Tom Gardner wrote:
>> On 02/08/18 22:12, David Brown wrote:
>>> On 02/08/18 19:14, Tom Gardner wrote:
>>>> On 02/08/18 17:56, upsidedown@downunder.com wrote:
>>>>> All you need is the ability to have stacks in RAM and in addition
>>>>> instructions for loading and storing the stack pointer from/to memory.
>>>>
>>>> Unfortunately that is difficult to implement in C, so youngsters
>>>> don't think of it.
>>>>
>>>
>>> This particular youngster grew up on small microcontrollers programmed
>>> in assembly.
>>
>> Ditto; neither of us are spring chickens.
>>
>> The first computer I designed, while still an undergrad, had
>> 128 bytes of memory. I still have that ic :)
> 
> I have never designed a chip - though in my teens I once designed key
> parts of a simple 4-bit cpu with a small instruction set.  I drew it all
> out in two input NAND gates, on graph paper - including the single-cycle
> multiplier.

Mine was 6800 based. Just about everything was "suboptimal" - except
that it worked and I learned a heck of a lot.

I later designed a single-purpose machine using 2900 bit-slices,
but it was never implemented.

> 
>>
>>> "All you need is the RAM" is not helpful when you have 512 bytes in
>>> total, and the stack pointer is limited to accessing the first 128
>>> bytes of that.  I have worked on microcontrollers where the context
>>> switches involved in handling an interrupt could easily take 50 µs or
>>> more - a proper RTOS would be far too high overhead.
>>>
>>> Of course you can have more minimal OS's with very limited features
>>> (all the way down to "protothreads"), and perhaps cooperative
>>> multi-tasking rather than pre-emptive.  But then you are not going to
>>> have threads with multiple message queues like the ones under discussion.
>>
>> Protothreads always strike me as being a kludge that is "justified"
>> by the limitations of the software tools rather than the hardware
>> limitations.
> 
> They are a way to get convenient coding structures from very little
> software or hardware.  I haven't had a use for them myself, but I
> suppose some people use them.

That's my (lack of) experience, but the requirement that all
context switches are in the top level code (i.e. not a function)
doesn't strike me as convenient. I know why it is "necessary",
but that doesn't change the inconvenience.
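
For anyone who hasn't seen the trick, here is a minimal sketch of the switch-based technique, using made-up `CR_*` macros rather than the actual protothreads library API. The resume point is stored as a line number and the function re-enters through a `switch`, which is why a wait can only appear directly in the body of the function and why locals do not survive across a wait.

```c
/* Per-coroutine state: locals do not survive a wait, so anything that
   must persist lives here. Initialise resume_line to 0. */
typedef struct {
    int resume_line;  /* 0 = start from the top */
    int stage;        /* example of persistent data */
} cr_t;

/* Hypothetical macros illustrating the technique. The wait expands to a
   case label, so it must sit inside the switch opened by CR_BEGIN in the
   same function, and each wait must be on its own source line. */
#define CR_BEGIN(cr)  switch ((cr)->resume_line) { case 0:
#define CR_WAIT_UNTIL(cr, cond)                          \
    do {                                                 \
        (cr)->resume_line = __LINE__; case __LINE__:     \
        if (!(cond)) return 0;   /* still waiting */     \
    } while (0)
#define CR_END(cr)    } (cr)->resume_line = 0; return 1  /* finished */

/* Waits for flag_a, then for flag_b. Returns 0 while waiting, 1 when done. */
int waiter(cr_t *cr, int flag_a, int flag_b)
{
    CR_BEGIN(cr);
    CR_WAIT_UNTIL(cr, flag_a);
    cr->stage = 1;
    CR_WAIT_UNTIL(cr, flag_b);
    cr->stage = 2;
    CR_END(cr);
}
```

Calling `waiter` repeatedly from a main loop resumes it at the last wait each time; moving either wait into a helper function would not compile, because the `case` label would no longer be inside the `switch`.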

Back in 1982 I was using cooperative multitasking with C threads
on a Z80 (or PDP11 or whatever was convenient), with a little
bit of assembler to save/restore the stacks and other context.
That always seemed pretty natural - provided I used message
passing (with timeouts) for large-scale flow control.
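
On a POSIX system the same shape can be sketched without any assembler by letting `getcontext`/`makecontext`/`swapcontext` do the stack and register save/restore. This is a stand-in for the hand-written switch described above, not what ran on the Z80; the task body, stack size and `run_round_robin` are assumptions for illustration, and message passing is left out.

```c
#define _XOPEN_SOURCE 600  /* some platforms need this to expose ucontext */
#include <ucontext.h>

#define NUM_COOP_TASKS 2
#define STACK_BYTES    (64 * 1024)  /* assumed; generous for a demo */
#define TRACE_MAX      64

static ucontext_t main_ctx;
static ucontext_t task_ctx[NUM_COOP_TASKS];
static char       stacks[NUM_COOP_TASKS][STACK_BYTES];
static char       trace[TRACE_MAX];
static int        trace_len;

/* Each task records its letter, then hands the CPU back to the scheduler.
   The yield is an ordinary call and could sit at any call depth. */
static void task_body(int id)
{
    for (;;) {
        trace[trace_len++] = (char)('A' + id);
        swapcontext(&task_ctx[id], &main_ctx);   /* cooperative yield */
    }
}

/* Give every task its own stack, then resume each in turn 'rounds' times.
   Returns the order in which the tasks ran, as a string. */
const char *run_round_robin(int rounds)
{
    if (rounds > (TRACE_MAX - 1) / NUM_COOP_TASKS)
        rounds = (TRACE_MAX - 1) / NUM_COOP_TASKS;  /* keep trace in bounds */

    trace_len = 0;
    for (int id = 0; id < NUM_COOP_TASKS; id++) {
        getcontext(&task_ctx[id]);
        task_ctx[id].uc_stack.ss_sp   = stacks[id];
        task_ctx[id].uc_stack.ss_size = sizeof stacks[id];
        task_ctx[id].uc_link          = &main_ctx;
        makecontext(&task_ctx[id], (void (*)(void))task_body, 1, id);
    }

    for (int r = 0; r < rounds; r++)
        for (int id = 0; id < NUM_COOP_TASKS; id++)
            swapcontext(&main_ctx, &task_ctx[id]);  /* run until it yields */

    trace[trace_len] = '\0';
    return trace;
}
```

Unlike the protothread style, each task here keeps its own stack, so local variables and nested calls survive a yield; the price is the RAM for those stacks.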

Reply by David Brown ● August 3, 2018

On 03/08/18 12:21, Tom Gardner wrote:
> On 03/08/18 09:28, David Brown wrote:
>> On 03/08/18 00:52, Tom Gardner wrote:
>>> On 02/08/18 22:12, David Brown wrote:
>>>> On 02/08/18 19:14, Tom Gardner wrote:
>>>>> On 02/08/18 17:56, upsidedown@downunder.com wrote:
>>>>>> All you need is the ability to have stacks in RAM and in addition
>>>>>> instructions for loading and storing the stack pointer from/to
>>>>>> memory.
>>>>>
>>>>> Unfortunately that is difficult to implement in C, so youngsters
>>>>> don't think of it.
>>>>>
>>>>
>>>> This particular youngster grew up on small microcontrollers programmed
>>>> in assembly.
>>>
>>> Ditto; neither of us are spring chickens.
>>>
>>> The first computer I designed, while still an undergrad, had
>>> 128 bytes of memory. I still have that ic :)
>>
>> I have never designed a chip - though in my teens I once designed key
>> parts of a simple 4-bit cpu with a small instruction set.  I drew it all
>> out in two input NAND gates, on graph paper - including the single-cycle
>> multiplier.
> 
> Mine was 6800 based. Just about everything was "suboptimal" - except
> that it worked and I learned a heck of a lot.
> 
> I later designed a single-purpose machine using 2900 bit-slices,
> but it was never implemented.
> 
>>
>>>
>>>> "All you need is the RAM" is not helpful when you have 512 bytes in
>>>> total, and the stack pointer is limited to accessing the first 128
>>>> bytes of that.  I have worked on microcontrollers where the context
>>>> switches involved in handling an interrupt could easily take 50 µs or
>>>> more - a proper RTOS would be far too high overhead.
>>>>
>>>> Of course you can have more minimal OS's with very limited features
>>>> (all the way down to "protothreads"), and perhaps cooperative
>>>> multi-tasking rather than pre-emptive.  But then you are not going to
>>>> have threads with multiple message queues like the ones under
>>>> discussion.
>>>
>>> Protothreads always strike me as being a kludge that is "justified"
>>> by the limitations of the software tools rather than the hardware
>>> limitations.
>>
>> They are a way to get convenient coding structures from very little
>> software or hardware.  I haven't had a use for them myself, but I
>> suppose some people use them.
> 
> That's my (lack of) experience, but the requirement that all
> context switches are in the top level code (i.e. not a function)
> doesn't strike me as convenient. I know why it is "necessary",
> but that doesn't change the inconvenience.

I agree entirely - that is one of the main things that puts me off about
them.  But others may find the benefits are worth it.

> 
> Back in 1982 I was using cooperative multitasking with C threads
> on a Z80 (or PDP11 or whatever was convenient), with a little
> bit of assembler to save/restore the stacks and other context.
> That always seemed pretty natural - provided I used message
> passing (with timeouts) for large-scale flow control.

Reply by George Neuner ● August 4, 2018

On Thu, 2 Aug 2018 18:14:55 +0100, Tom Gardner
<spamjunk@blueyonder.co.uk> wrote:

>On 02/08/18 17:56, upsidedown@downunder.com wrote:
>> All you need is the ability to have stacks in RAM and in addition
>> instructions for loading and storing the stack pointer from/to memory.
>
>Unfortunately that is difficult to implement in C, so youngsters
>don't think of it.

It can get a little messy, certainly, but really, task switching in C
is not that hard provided the jmp_buf structure is documented.

George

Reply by George Neuner ● August 4, 2018

On Thu, 2 Aug 2018 23:13:27 +0200, David Brown
<david.brown@hesbynett.no> wrote:

>On 02/08/18 19:23, Robert Wessel wrote:
>> On Thu, 02 Aug 2018 16:14:47 +0300, upsidedown@downunder.com wrote:
>> 
>>> On Thu, 02 Aug 2018 12:24:49 +0200, David Brown
>>> <david.brown@hesbynett.no> wrote:
>>>
>>> Windows NT 3.5 and later on was a full blown priority based
>>> multitasking system with close resemblance to RSX-11 and VMS.
>> 
>> Windows NT 3.1, actually.
>> 
>
>Yes, of course.  Win NT 3.1 was rarely seen in the wild, but it did exist.

And the reason it bore a resemblance to RSX-11 and VMS was because
many of the people who wrote it were hired away from DEC.

George
