
C++ threads versus PThreads for embedded Linux on ARM micro

Started by gp.k...@gmail.com July 20, 2018
Traditional threads, whichever way you package them (as C++ threads, p-threads or any other thread library), typically correspond to the "shared-state concurrency and blocking" approach. This approach is known to be problematic, and many experts in concurrent programming recommend drastically limiting both sharing and blocking according to the following three best practices:

1. Keep data isolated and bound to threads. Threads should hide (encapsulate) their private data and other resources, and not share them with the rest of the system.

2. Communicate among threads asynchronously via messages (event objects). Using asynchronous events keeps the threads running truly independently, without any further blocking on each other.

3. Threads should spend their lifetime responding to incoming events, so their mainline should consist of an event-loop that handles events one at a time (to completion), thus avoiding any concurrency hazards within a thread itself.

This set of best practices is collectively known as the Active Object design pattern (a.k.a. Actor). While this pattern can be applied manually on top of traditional threads, a better way is to use an Active Object framework.

The main difference is that when you use "naked" threads, you write the main body of the application (such as the thread routines for all your tasks) and you call various thread-library services (e.g., a semaphore or a time delay). When you use a framework, you reuse the overall architecture and write the code that it calls. This leads to inversion of control, which allows the framework to automatically enforce the best practices of concurrent programming. In contrast, "naked" threads let you do anything and offer no help or automation for the best practices.
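For illustration, an active object built directly on a bare C++ thread boils down to roughly this (a simplified, untested sketch, not actual framework code; the names are just for illustration):

#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

struct Event { int sig; /* plus any event parameters */ };

class ActiveObject {
public:
    void start() { thread_ = std::thread([this] { eventLoop(); }); }

    // The only public entry point: post an event asynchronously.
    void post(Event const &e) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(e);
        }
        cv_.notify_one();
    }

private:
    void eventLoop() {                      // the thread's whole mainline
        for (;;) {
            std::unique_lock<std::mutex> lock(mutex_);
            cv_.wait(lock, [this] { return !queue_.empty(); }); // block ONLY here
            Event e = queue_.front();
            queue_.pop_front();
            lock.unlock();
            dispatch(e);                    // run to completion, no other blocking
        }
    }

    void dispatch(Event const &e) { (void)e; /* private state machine goes here */ }

    std::deque<Event> queue_;               // private data, never shared directly
    std::mutex mutex_;
    std::condition_variable cv_;
    std::thread thread_;                    // shutdown/joining omitted for brevity
};

A framework packages exactly this structure for you, so the application code only supplies the dispatch() behavior.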
Thanks.  What is an "event object"?  What is the best way to pass data asynchronously using a queue, on Linux?  I've read that lock-free data structures are easy to get wrong and best avoided, and that the C++ thread library doesn't have any lock-free data structures - mainly because there are too many variations to have a generalized data structure.

Can we use the "libcds" library and be confident that it will work correctly?

https://github.com/khizmax/libcds


On Saturday, July 28, 2018 at 3:52:02 AM UTC+12, StateMachineCOM wrote:
> Traditional threads, whichever way you package them (as C++ threads, p-threads or any other thread library), typically correspond to the "shared-state concurrency and blocking" approach. This approach is known to be problematic, and many experts in concurrent programming recommend drastically limiting both sharing and blocking according to the following three best practices:
[snip]
"gp.kiwi@gmail.com" <graeme.prentice@gmail.com> writes:
> Thanks. What is an "event object"?
Just a message that you pass from one thread to another.
> What is the best way to pass data asynchronously using a queue, on Linux?
I'd probably just use std::deque with a lock. You might look at seastar-project.org for some inspiration.
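Something along these lines is usually enough (untested sketch; the class name LockedQueue is just for illustration):

#include <condition_variable>
#include <deque>
#include <mutex>

// std::deque guarded by a mutex, with a condition variable so the
// consumer can block while the queue is empty.
template <typename T>
class LockedQueue {
public:
    void push(T v) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push_back(std::move(v));
        }
        cv_.notify_one();
    }

    T pop() {                        // blocks until an item is available
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        T v = std::move(q_.front());
        q_.pop_front();
        return v;
    }

private:
    std::deque<T> q_;
    std::mutex m_;
    std::condition_variable cv_;
};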
On Friday, July 27, 2018 at 5:53:28 PM UTC-4, gp....@gmail.com wrote:
> What is an "event object"? What is the best way to pass data asynchronously using a queue, on Linux?
I can tell you how this is done in the QP/C++ framework, which I've designed and refined for almost two decades now. But before I get to the technical details, I need to make full disclosure that QP is a dual-licensed (open-source/commercial) product of my company (see https://www.state-machine.com), so I do have a commercial interest in promoting it.

So, going back to your question, "event objects" are messages that threads send to each other via event queues. But a naive implementation that copies messages to and from the queues is expensive and hurts real-time performance. So, in the QP framework, events are allocated from fixed-size pools and only pointers to events are kept in the event queues. The framework maintains copy-by-value semantics as much as possible, while event objects are really shared under the hood. The framework also automatically recycles events after they have been processed.

Specifically in the POSIX port of QP/C++, which has been available for over 15 years now, each active object runs in its own p-thread. These threads are structured as event loops (according to the best practice I listed in my previous post), so they block in only one place--when the event queue is empty. Internally, the queue uses a p-thread mutex and a condition variable to implement blocking on an empty queue and signaling the queue. But the application programmer does not need to know any of this, because the main point is that the framework does the heavy lifting of thread-safe asynchronous event exchange. The application threads (active objects) only process the events one at a time (to completion), and they don't need to worry about low-level mechanisms like mutexes or condition variables.

The design also allows you to avoid sharing anything (except events) among the threads, which is another best practice of concurrent programming. This means that you don't need to use any synchronization objects. In this sense, the RAII benefits of the synchronization mechanisms in the C++ thread library don't matter.

There is of course much more to an active object framework like QP/C++ than I can capture here. For example, the framework supports Hierarchical State Machines to implement the internal behavior of active objects. There is also a free modeling tool (QM), with which you can design your HSMs graphically and generate production code automatically. But all of this requires a paradigm shift from traditional sequential programming with blocking to event-driven programming without blocking or sharing. To learn more, you might read about the key concepts here: https://www.state-machine.com/doc/concepts

Miro Samek
state-machine.com
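P.S. To make the queue mechanism concrete, here is a much-simplified sketch (NOT the actual QP code; names are only illustrative) of a blocking event queue built on a p-thread mutex and condition variable, holding only pointers to events:

#include <pthread.h>
#include <cstddef>

struct Event;                          // events live in fixed-size pools elsewhere

// Blocking event queue: stores only pointers to events; the consumer
// blocks on a condition variable when the queue is empty.  Event pools,
// recycling, overflow checks and error handling are all omitted.
class EventQueue {
public:
    EventQueue() : head_(0), tail_(0), count_(0) {
        pthread_mutex_init(&mutex_, nullptr);
        pthread_cond_init(&notEmpty_, nullptr);
    }

    void post(Event *e) {              // called by any producer thread
        pthread_mutex_lock(&mutex_);
        ring_[head_] = e;              // store only the pointer, no copying
        head_ = (head_ + 1) % LEN;
        ++count_;
        pthread_cond_signal(&notEmpty_);
        pthread_mutex_unlock(&mutex_);
    }

    Event *get() {                     // called by the owning active object
        pthread_mutex_lock(&mutex_);
        while (count_ == 0) {          // the ONLY place this thread blocks
            pthread_cond_wait(&notEmpty_, &mutex_);
        }
        Event *e = ring_[tail_];
        tail_ = (tail_ + 1) % LEN;
        --count_;
        pthread_mutex_unlock(&mutex_);
        return e;                      // caller processes it, then recycles it
    }

private:
    static const std::size_t LEN = 16; // queue capacity
    Event *ring_[LEN];
    std::size_t head_, tail_, count_;
    pthread_mutex_t mutex_;
    pthread_cond_t notEmpty_;
};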
On 27/07/18 17:51, StateMachineCOM wrote:
> Traditional threads, whichever way you package them (as C++ threads, p-threads or any other thread library), typically correspond to the "shared-state concurrency and blocking" approach. This approach is known to be problematic, and many experts in concurrent programming recommend drastically limiting both sharing and blocking according to the following three best practices:
>
> 1. Keep data isolated and bound to threads. Threads should hide (encapsulate) their private data and other resources, and not share them with the rest of the system.
Encapsulation is always a good principle, but don't take it too far. If two parts of the system need to share data of significant size, then you want shared data, not "messages" or other synchronisation mechanisms. (You use the messages or other synchronisation to communicate metadata - such as who owns the real data space at any given time - but not the data itself.)
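As a sketch of the idea (the names here are purely illustrative), the "message" can be little more than an ownership token:

#include <cstdint>
#include <memory>
#include <vector>

// The message that travels through the queue carries only ownership of
// the big buffer, never a copy of its contents.  After std::move()-ing
// the pointer into the message, the sender must not touch the buffer.
struct BufferHandoff {
    std::unique_ptr<std::vector<std::uint8_t>> data;  // holder owns the bytes
};

The data stays in place; only the right to access it changes hands.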
> 2. Communicate among threads asynchronously via messages (event objects). Using asynchronous events keeps the threads running truly independently, without any further blocking on each other.
Blocking is fine with threads. If you have a single-core CPU - or more threads than cores - then blocking is often more efficient than attempting to continue. After all, if thread A is asking thread B to do something (via a message, actor call, or whatever), then thread B can't get started on the work A wants until A has taken a break. It's cheaper to have a voluntary break (a yield, or a blocking call) than to wait for a scheduling change.

So use blocking calls whenever they fit naturally in the progression of the code - and non-blocking calls whenever /that/ is the more natural fit. Don't make the mistake of thinking that one is inherently "better" or necessarily more efficient - /measure/ the /real/ effects if efficiency is vital.
> 3. Threads should spend their lifetime responding to incoming events, so their mainline should consist of an event-loop that handles events one at a time (to completion), thus avoiding any concurrency hazards within a thread itself.
>
> This set of best practices is collectively known as the Active Object design pattern (a.k.a. Actor). While this pattern can be applied manually on top of traditional threads, a better way is to use an Active Object framework.
Actor designs can certainly have their advantages - equally certainly, they are not the best design for all uses. Whenever someone says "this is the best way to do it", it's unlikely to be that simple - and whenever they say so without knowing exact details of the problem at hand, they are almost certainly wrong.
> The main difference is that when you use "naked" threads, you write the main body of the application (such as the thread routines for all your tasks) and you call various thread-library services (e.g., a semaphore or a time delay). When you use a framework, you reuse the overall architecture and write the code that it calls. This leads to inversion of control, which allows the framework to automatically enforce the best practices of concurrent programming. In contrast, "naked" threads let you do anything and offer no help or automation for the best practices.
On 27/07/18 23:53, gp.kiwi@gmail.com wrote:
> Thanks. What is an "event object"? What is the best way to pass data asynchronously using a queue, on Linux? I've read that lock-free data structures are easy to get wrong and best avoided, and that the C++ thread library doesn't have any lock-free data structures - mainly because there are too many variations to have a generalized data structure.
Lock-free data structures can range from very simple to very difficult, and there can be huge differences depending on the details of the structure. A single-writer, single-reader fixed-size queue is /easy/ - it's just two atomic counters for "head" and "tail" and an array, with a little care about memory ordering. For single-core embedded processors, it's usually sufficient to just use "volatile" - for bigger systems, C++11 or C11 atomics handle the details.

On the other hand, a queue that can have variable size, and more than one reader or writer, quickly gets really complicated to handle lock-free, and often it is much simpler, safer and cheaper to use a lock. On the third hand, if you have multiple cores you might want lock-free again for scalability.

There is no simple answer here, and much depends on the details of exactly what you want. As long as you ask general questions, you'll only get general answers.
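As a rough illustration (untested sketch, the names are mine), such a single-writer, single-reader queue with C++11 atomics looks something like this:

#include <array>
#include <atomic>
#include <cstddef>

// Single-producer / single-consumer fixed-size queue: two indices and an
// array.  Only the producer writes 'head', only the consumer writes
// 'tail'; acquire/release ordering does the rest.  Not valid with more
// than one reader or more than one writer.
template<typename T, std::size_t N>
class SpscQueue {
public:
    bool push(T const &v) {                        // producer thread only
        std::size_t h = head_.load(std::memory_order_relaxed);
        std::size_t next = (h + 1) % N;
        if (next == tail_.load(std::memory_order_acquire))
            return false;                          // full
        buf_[h] = v;
        head_.store(next, std::memory_order_release);
        return true;
    }

    bool pop(T &out) {                             // consumer thread only
        std::size_t t = tail_.load(std::memory_order_relaxed);
        if (t == head_.load(std::memory_order_acquire))
            return false;                          // empty
        out = buf_[t];
        tail_.store((t + 1) % N, std::memory_order_release);
        return true;
    }

private:
    std::array<T, N> buf_{};
    std::atomic<std::size_t> head_{0};             // next slot to write
    std::atomic<std::size_t> tail_{0};             // next slot to read
};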
> Can we use the "libcds" library and be confident that it will work correctly?
>
> https://github.com/khizmax/libcds
>
> On Saturday, July 28, 2018 at 3:52:02 AM UTC+12, StateMachineCOM wrote:
>> Traditional threads, whichever way you package them (as C++ threads, p-threads or any other thread library), typically correspond to the "shared-state concurrency and blocking" approach. This approach is known to be problematic, and many experts in concurrent programming recommend drastically limiting both sharing and blocking according to the following three best practices:
> [snip]
@David Brown: Absolutely, if you stick to the traditional sequential programming paradigm with shared-state concurrency and blocking threads, the three best practices I listed in my previous post can all be questioned, relaxed, and ultimately dismissed.

That's because they represent a different, event-driven ("reactive") programming paradigm. The distinction is important, because the two programming paradigms do NOT mix well, certainly not inside the same thread. So it is important to always realize which paradigm you are using in which thread, to avoid confusion and mixing the two.

To back up this point, I'd like to recommend the article "Managing Concurrency in Complex Embedded Systems" by David Cummings (http://www.kellytechnologygroup.com/main/concurrent-embedded-systems-website.pdf ). The author presents general guiding principles of structuring threads, which he found particularly useful and which he applied in the NASA Mars rovers and other mission-critical systems. The paper starts with the description of the general thread structure, which can be immediately recognized as the event-loop. The bulk of the paper then focuses on discussing several scenarios in which designers might be tempted to apply thread BLOCKING, followed by explanations why blocking is always a BAD idea. Again, I repeat that this conclusion applies to the "event-driven" thread structure, which the author started with.
On 01/08/18 00:16, StateMachineCOM wrote:
> @David Brown: Absolutely, if you stick to the traditional sequential programming paradigm with shared-state concurrency and blocking threads, the three best practices I listed in my previous post can all be questioned, relaxed, and ultimately dismissed.
>
> That's because they represent a different, event-driven ("reactive") programming paradigm. The distinction is important, because the two programming paradigms do NOT mix well, certainly not inside the same thread. So it is important to always realize which paradigm you are using in which thread, to avoid confusion and mixing the two.
>
> To back up this point, I'd like to recommend the article "Managing Concurrency in Complex Embedded Systems" by David Cummings (http://www.kellytechnologygroup.com/main/concurrent-embedded-systems-website.pdf ). The author presents general guiding principles of structuring threads, which he found particularly useful and which he applied in the NASA Mars rovers and other mission-critical systems. The paper starts with the description of the general thread structure, which can be immediately recognized as the event-loop. The bulk of the paper then focuses on discussing several scenarios in which designers might be tempted to apply thread BLOCKING, followed by explanations why blocking is always a BAD idea. Again, I repeat that this conclusion applies to the "event-driven" thread structure, which the author started with.
I haven't read the link yet (I will do so), but I do agree that blocking is a very bad idea in an event-driven thread.
On 01/08/2018 00:16, StateMachineCOM wrote:
> @David Brown: Absolutely, if you stick to the traditional sequential programming paradigm with shared-state concurrency and blocking threads, the three best practices I listed in my previous post can all be questioned, relaxed, and ultimately dismissed.
>
> That's because they represent a different, event-driven ("reactive") programming paradigm. The distinction is important, because the two programming paradigms do NOT mix well, certainly not inside the same thread. So it is important to always realize which paradigm you are using in which thread, to avoid confusion and mixing the two.
>
> To back up this point, I'd like to recommend the article "Managing Concurrency in Complex Embedded Systems" by David Cummings (http://www.kellytechnologygroup.com/main/concurrent-embedded-systems-website.pdf ).
Thanks for this reference... it is very, *very* instructive material.
On Tue, 31 Jul 2018 15:16:07 -0700 (PDT), StateMachineCOM
<statemachineguru@gmail.com> wrote:

>@David Brown: Absolutely, if you stick to the traditional sequential programming paradigm with shared-state concurrency and blocking threads, the three best practices I listed in my previous post can all be questioned, relaxed, and ultimately dismissed.
>
>That's because they represent a different, event-driven ("reactive") programming paradigm. The distinction is important, because the two programming paradigms do NOT mix well, certainly not inside the same thread. So it is important to always realize which paradigm you are using in which thread, to avoid confusion and mixing the two.
>
>To back up this point, I'd like to recommend the article "Managing Concurrency in Complex Embedded Systems" by David Cummings (http://www.kellytechnologygroup.com/main/concurrent-embedded-systems-website.pdf ). The author presents general guiding principles of structuring threads, which he found particularly useful and which he applied in the NASA Mars rovers and other mission-critical systems. The paper starts with the description of the general thread structure, which can be immediately recognized as the event-loop. The bulk of the paper then focuses on discussing several scenarios in which designers might be tempted to apply thread BLOCKING, followed by explanations why blocking is always a BAD idea. Again, I repeat that this conclusion applies to the "event-driven" thread structure, which the author started with.
It seems Cummings has reinvented the wheel :-). Those principles were already in use in the 1970s to implement real-time systems under RSX-11 on the PDP-11. Later on, these principles were also used in real-time systems under RMX-80 for the 8080, and in similar kernels.
