C++ threads versus PThreads for embedded Linux on ARM micro

We're starting an embedded Linux C++ project with an ARM micro and using GCC V7.  Can anyone suggest pros and cons of using C++ threads versus pthreads (POSIX threads)?

Reply by David Brown ●July 20, 2018

On 20/07/18 13:01, gp.kiwi@gmail.com wrote:
> We're starting an embedded Linux C++ project with an ARM micro and using GCC V7.  Can anyone suggest pros and cons of using C++ Threads versus PThreads (Posix threads).
> 

C++ threads are always a wrapper around an underlying library.  So if 
you are using C++ on Linux, the C++ threads /are/ pthreads.  These are 
the points I can think of for preferring C++ threads:

+ You have a nice class/template library with C++ threads, instead of a 
C function interface.

+ You have RAII classes for locks and other synchronisation objects.

+ You have consistency with other C++ thread systems.

+ Your compiler may understand that your code is threaded.

- You need at least C++11 (but that has huge advantages anyway, compared 
to older C++).

- It is marginally more fiddly if you need the underlying thread details 
for features not supported by the C++ thread library.
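
To make two of the points above concrete, here is a minimal sketch, assuming GCC's libstdc++ on Linux (where std::thread::native_handle() yields a pthread_t). The names Counter, run_workers and policy_of are made up for illustration. The first part shows an RAII lock (std::lock_guard); the second shows the "marginally more fiddly" route of reaching the underlying pthread for something the C++ library does not expose, here reading the scheduling policy.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>
#include <pthread.h>
#include <sched.h>

// Hypothetical shared counter protected by a mutex.
struct Counter {
    std::mutex m;
    long value = 0;

    void add(long n) {
        std::lock_guard<std::mutex> lock(m);  // released automatically at scope exit
        value += n;
    }
};

// Runs `threads` workers that each call add(1) `iterations` times
// and returns the final count.
long run_workers(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    Counter c;
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&c, iterations] {
            for (int k = 0; k < iterations; ++k) c.add(1);
        });
    }
    for (auto& t : pool) t.join();
    return c.value;
}

// Reads the scheduling policy of a running std::thread through its
// pthread handle. Returns SCHED_OTHER, SCHED_FIFO, ... or -1 on error.
int policy_of(std::thread& t) {
    int policy = 0;
    sched_param param{};
    if (pthread_getschedparam(t.native_handle(), &policy, &param) != 0) return -1;
    return policy;
}
```

Build with -pthread. Calling run_workers(4, 10000) should return 40000 because every increment happens under the lock. Setting (rather than reading) a real-time policy would go through pthread_setschedparam in the same way, but that normally needs extra privileges.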

Reply by ●July 20, 2018

On Saturday, July 21, 2018 at 6:19:31 AM UTC+12, David Brown wrote:
> On 20/07/18 13:01, gp...@gmail.com wrote:
> > We're starting an embedded Linux C++ project with an ARM micro and using GCC V7.  Can anyone suggest pros and cons of using C++ Threads versus PThreads (Posix threads).
> > 
> 
> C++ threads are always a wrapper around an underlying library.  So if 
> you are using C++ on Linux, the C++ threads /are/ pthreads.  These are 
> the points I can think of for preferring C++ threads:
> 
> + You have a nice class/template library with C++ threads, instead of a 
> C function interface.
> 
> + You have RAII classes for locks and other synchronisation objects.
> 
> + You have consistency with other C++ thread systems.
> 
> + Your compiler may understand that your code is threaded.
> 
> - You need at least C++11 (but that has huge advantages anyway, compared 
> to older C++).
> 
> - It is marginally more fiddly if you need the underlying thread details 
> for features not supported by the C++ thread library.

That's great, thanks.

Reply by Paul Rubin ●July 20, 2018

graeme.prentice@gmail.com writes:
> That's great, thanks.

Also, ask yourself if you really need threads in the first place.
Depending on what you're doing, you may be better off with multiple
processes.  That gets rid of a lot of lock and race hazards, and if the
processes can communicate through sockets, that improves scalability by
making it easier for you to distribute your program across multiple
machines if you run out of cpu cores on your original machine.

Reply by ●July 21, 2018

On Saturday, July 21, 2018 at 1:28:02 PM UTC+12, Paul Rubin wrote:
> 
> Also, ask yourself if you really need threads in the first place.
> Depending on what you're doing, you may be better off with multiple
> processes.  That gets rid of a lot of lock and race hazards, and if the
> processes can communicate through sockets, that improves scalability by
> making it easier for you to distribute your program across multiple
> machines if you run out of cpu cores on your original machine.

Thanks for the suggestion.  The micro is an ARM9 LPC3250 SOM (we're forced to use this at the moment) which I believe is single core (it's hard to find out for some reason) but it could easily change in future.  Based on a previous project, race conditions and deadlocks are a major headache so I'm hoping the core data will be written to by one thread only, maybe with lock-free queues.  The CPU data cache is 32KB and it's probably "write through".  We would have to do some performance tests to see if multi-processes and sockets is viable.
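
As an illustration of the "one writer, lock-free queue" idea, here is a rough sketch of a single-producer/single-consumer ring buffer built on std::atomic. It is not taken from any particular library; SpscQueue and transfer_sum are hypothetical names, and the capacity of 64 is arbitrary. It is only valid with exactly one pushing thread and one popping thread.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>

// Single-producer/single-consumer ring buffer. One slot is always left
// empty so that head == tail unambiguously means "empty".
template <typename T, std::size_t N>
class SpscQueue {
    static_assert(N >= 2, "need at least two slots");
    std::array<T, N> buf_{};
    std::atomic<std::size_t> head_{0};  // next slot to read  (written by consumer)
    std::atomic<std::size_t> tail_{0};  // next slot to write (written by producer)
public:
    bool push(const T& v) {             // producer thread only
        const std::size_t t = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (t + 1) % N;
        if (next == head_.load(std::memory_order_acquire)) return false;  // full
        buf_[t] = v;
        tail_.store(next, std::memory_order_release);  // publish the element
        return true;
    }
    bool pop(T& out) {                  // consumer thread only
        const std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire)) return false;     // empty
        out = buf_[h];
        head_.store((h + 1) % N, std::memory_order_release);  // free the slot
        return true;
    }
};

// Sends 0..count-1 through the queue and returns the sum seen by the consumer.
long long transfer_sum(int count) {
    assert(count >= 0);
    SpscQueue<int, 64> q;
    long long sum = 0;
    std::thread producer([&] {
        for (int i = 0; i < count; ++i)
            while (!q.push(i)) std::this_thread::yield();
    });
    std::thread consumer([&] {
        int v = 0;
        for (int i = 0; i < count; ++i) {
            while (!q.pop(v)) std::this_thread::yield();
            sum += v;
        }
    });
    producer.join();
    consumer.join();
    return sum;
}
```

Whether std::atomic<std::size_t> is genuinely lock-free on a given target can be checked at run time with is_lock_free(); that is worth verifying on the actual board rather than assuming.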

Reply by David Brown ●July 21, 2018

On 21/07/18 08:58, graeme.prentice@gmail.com wrote:
> On Saturday, July 21, 2018 at 1:28:02 PM UTC+12, Paul Rubin wrote:
>>
>> Also, ask yourself if you really need threads in the first place.
>> Depending on what you're doing, you may be better off with multiple
>> processes.  

There are certainly tasks that are better handled as multiple processes 
rather than multiple threads.  (But note that it is not an either/or 
choice - often the best solution uses both.)

>> That gets rid of a lot of lock and race hazards, 

No, it does not - it merely changes them.  If your separate threads of 
execution need to synchronise, communicate, or agree about shared 
resources, then there is no fundamental difference in the types of 
hazards, races, or other such problems whether you use multiple threads or 
multiple processes.  The details change, and the types of 
synchronisation objects used can change, but they do not go away. 
Some may be handled by the OS rather than the application, however - for 
example, a pipe between processes will let you communicate without worry 
about locks for the underlying shared data structure, at the cost of 
being a lot less efficient than shared memory in threads.
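
A minimal sketch of the pipe case, assuming plain POSIX calls (pipe, fork, read, write): the kernel owns the buffer, so neither side takes a lock. The function name send_through_pipe is made up for this example, and error handling is kept deliberately thin.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that writes `msg` into a pipe; the parent reads it back.
// Returns the received text, or an empty string on failure.
// Sketch only: a single write() is assumed to be enough for short messages.
std::string send_through_pipe(const std::string& msg) {
    int fds[2];
    if (pipe(fds) != 0) return {};

    const pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return {};
    }

    if (pid == 0) {                       // child: writer
        close(fds[0]);
        const ssize_t written = write(fds[1], msg.data(), msg.size());
        (void)written;                    // result ignored in this sketch
        close(fds[1]);
        _exit(0);
    }

    close(fds[1]);                        // parent: reader
    std::string received;
    char chunk[64];
    ssize_t n;
    while ((n = read(fds[0], chunk, sizeof chunk)) > 0)   // 0 means writer closed
        received.append(chunk, static_cast<std::size_t>(n));
    close(fds[0]);
    waitpid(pid, nullptr, 0);             // reap the child
    return received;
}
```

Every byte here is copied into the kernel and back out again, which is where the efficiency cost relative to shared memory between threads comes from.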

Multiple processes have higher resource costs, and they make it a lot 
harder to use tools such as "-fsanitize=thread" to find problems.  On 
the other hand, they make it easier to break the problem down into 
separate tasks that are handled independently and tested independently. 
That helps if you have different developers - or even different 
programming languages.

>> and if the
>> processes can communicate through sockets, that improves scalability by
>> making it easier for you to distribute your program across multiple
>> machines if you run out of cpu cores on your original machine.

True.

This can also be useful during development when you might have some of 
the bits running on your target system, and other bits running on your 
host computer (perhaps under a debugger).

> 
> Thanks for the suggestion. The micro is an ARM9 LPC3250 SOM (we're
> forced to use this at the moment) which I believe is single core (it's
> hard to find out for some reason) but it could easily change in future.

That doesn't matter for the choice of threads, processes or both.

> Based on a previous project, race conditions and deadlocks are a major
> headache so I'm hoping the core data will be written to by one thread
> only, maybe with lock-free queues. The CPU data cache is 32KB and it's
> probably "write through". We would have to do some performance tests to
> see if multi-processes and sockets is viable.

Multiple processes are slower than multiple threads, and sockets are 
much slower than in-process queues.  But the sockets are more flexible. 
You might find you want an abstraction that can use either method as a 
backend, and change during different stages of development.
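
One possible shape for such an abstraction, as a sketch only: the application codes against a small interface, and the backend is either an in-process queue or a Unix socket pair. All names here (Channel, QueueChannel, SocketChannel, round_trip) are hypothetical, and the 512-byte receive buffer is an arbitrary limit. In a real multi-process setup one end of the socket pair would be handed to the other process after fork(); both ends stay in one object here just to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical interface: application code only ever sees this.
struct Channel {
    virtual ~Channel() = default;
    virtual bool send(const std::string& msg) = 0;
    virtual bool receive(std::string& out) = 0;   // false if nothing is waiting
};

// Backend 1: in-process queue guarded by a mutex (for threads).
class QueueChannel : public Channel {
    std::mutex m_;
    std::deque<std::string> q_;
public:
    bool send(const std::string& msg) override {
        std::lock_guard<std::mutex> lock(m_);
        q_.push_back(msg);
        return true;
    }
    bool receive(std::string& out) override {
        std::lock_guard<std::mutex> lock(m_);
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop_front();
        return true;
    }
};

// Backend 2: Unix datagram socket pair, which preserves message boundaries.
// Messages longer than the receive buffer are truncated in this sketch.
class SocketChannel : public Channel {
    int fds_[2] = {-1, -1};
public:
    SocketChannel() {
        if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds_) != 0) fds_[0] = fds_[1] = -1;
    }
    ~SocketChannel() override {
        if (fds_[0] >= 0) close(fds_[0]);
        if (fds_[1] >= 0) close(fds_[1]);
    }
    SocketChannel(const SocketChannel&) = delete;
    SocketChannel& operator=(const SocketChannel&) = delete;

    bool send(const std::string& msg) override {
        if (fds_[0] < 0) return false;
        return ::send(fds_[0], msg.data(), msg.size(), 0) ==
               static_cast<ssize_t>(msg.size());
    }
    bool receive(std::string& out) override {
        if (fds_[1] < 0) return false;
        char buf[512];
        const ssize_t n = ::recv(fds_[1], buf, sizeof buf, MSG_DONTWAIT);
        if (n < 0) return false;          // nothing queued, or an error
        out.assign(buf, static_cast<std::size_t>(n));
        return true;
    }
};

// Application code written against the interface only.
std::string round_trip(Channel& ch, const std::string& msg) {
    std::string out;
    if (!ch.send(msg) || !ch.receive(out)) return {};
    return out;
}
```

With this layout, round_trip behaves the same whether it is given a QueueChannel or a SocketChannel, so the backend can be swapped at different stages of development without touching the callers.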

Reply by Tauno Voipio ●July 21, 2018

On 21.7.18 09:58, graeme.prentice@gmail.com wrote:
> Thanks for the suggestion.  The micro is an ARM9 LPC3250 SOM (we're forced to use this at the moment) which I believe is single core (it's hard to find out for some reason) but it could easily change in future.  Based on a previous project, race conditions and deadlocks are a major headache so I'm hoping the core data will be written to by one thread only, maybe with lock-free queues.  The CPU data cache is 32KB and it's probably "write through".  We would have to do some performance tests to see if multi-processes and sockets is viable.



For number of cores on your system, if you have Linux running on the
target, have a look at /proc/cpuinfo:

tauno@pi2:~ $ cat /proc/cpuinfo
processor	: 0
model name	: ARMv7 Processor rev 5 (v7l)
BogoMIPS	: 38.40
Features	: half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer	: 0x41
CPU architecture: 7
CPU variant	: 0x0
CPU part	: 0xc07
CPU revision	: 5

processor	: 1
model name	: ARMv7 Processor rev 5 (v7l)
BogoMIPS	: 38.40
Features	: half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer	: 0x41
CPU architecture: 7
CPU variant	: 0x0
CPU part	: 0xc07
CPU revision	: 5

processor	: 2
model name	: ARMv7 Processor rev 5 (v7l)
BogoMIPS	: 38.40
Features	: half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer	: 0x41
CPU architecture: 7
CPU variant	: 0x0
CPU part	: 0xc07
CPU revision	: 5

processor	: 3
model name	: ARMv7 Processor rev 5 (v7l)
BogoMIPS	: 38.40
Features	: half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer	: 0x41
CPU architecture: 7
CPU variant	: 0x0
CPU part	: 0xc07
CPU revision	: 5

Hardware	: BCM2835
Revision	: a01041
Serial		: 0000000064d34ba1

---

The above is from a Raspberry Pi 2.
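
The same question can be asked from inside a program. A small sketch, assuming a Linux C library that supports the _SC_NPROCESSORS_ONLN query (glibc does); the function names online_cores and hinted_cores are made up for illustration.

```cpp
#include <cassert>
#include <thread>
#include <unistd.h>

// Number of processors currently online, as reported by the C library.
long online_cores() {
    return sysconf(_SC_NPROCESSORS_ONLN);
}

// Portable C++11 hint. The standard allows this to return 0 when the
// value cannot be determined, so treat it as advisory.
unsigned hinted_cores() {
    return std::thread::hardware_concurrency();
}
```

On a single-core part both calls would be expected to report 1, and 4 on the Raspberry Pi 2 shown above.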

-- 

-TV

Reply by Les Cargill ●July 22, 2018

gp.kiwi@gmail.com wrote:
> We're starting an embedded Linux C++ project with an ARM micro and
> using GCC V7.  Can anyone suggest pros and cons of using C++ Threads
> versus PThreads (Posix threads).
> 

I'd suggest thinking about a design for how you'd measure which works
in context for you. If I run pthreads on a big Linux machine it'll be 
different from running them in a VM. Similarly, it'll be different on
a RasPi 3 sized ARM computer.

-- 
Les Cargill

Reply by Rob Gaddi ●July 23, 2018

On 07/20/2018 11:58 PM, graeme.prentice@gmail.com wrote:
> On Saturday, July 21, 2018 at 1:28:02 PM UTC+12, Paul Rubin wrote:
>>
>> Also, ask yourself if you really need threads in the first place.
>> Depending on what you're doing, you may be better off with multiple
>> processes.  That gets rid of a lot of lock and race hazards, and if the
>> processes can communicate through sockets, that improves scalability by
>> making it easier for you to distribute your program across multiple
>> machines if you run out of cpu cores on your original machine.
> 
> Thanks for the suggestion.  The micro is an ARM9 LPC3250 SOM (we're forced to use this at the moment) which I believe is single core (it's hard to find out for some reason) but it could easily change in future.  Based on a previous project, race conditions and deadlocks are a major headache so I'm hoping the core data will be written to by one thread only, maybe with lock-free queues.  The CPU data cache is 32KB and it's probably "write through".  We would have to do some performance tests to see if multi-processes and sockets is viable.
> 

3250 is single-core.  Happens to be a part we use a lot around here, 
although we always go bare-metal rather than running Linux.  Cache is 
programmable through the page-table as to whether it's write-through or not.

The reason you're having trouble determining much of this information is 
that NXP bought large chunks of that chip wholesale from ARM without 
anyone there actually understanding it.  So the NXP documentation is 
spotty and occasionally wrong (let me tell you of our I2C-based woes).

There's a document available directly from ARM, ARM DDI 0198E, that is 
specifically the ARM926EJ-S Technical Reference Manual.  Getting into 
the details on the 3250 is nearly impossible without it.

-- 
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.

Reply by mac ●July 26, 2018

You can also communicate among processes through shared memory (e.g. mmap).

To look at it another way, processes require explicit sharing, threads share
implicitly.

On an embedded system, the heavier cost of process switching may be
important.
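
A minimal sketch of the explicit-sharing point, assuming Linux's MAP_ANONYMOUS flag: only the page that is deliberately mapped MAP_SHARED is visible to both processes, and everything else is private after fork(). The function name share_with_child is hypothetical.

```cpp
#include <cassert>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Maps one int as shared memory, lets a forked child store `value` in it,
// and returns what the parent reads afterwards (-1 on failure).
int share_with_child(int value) {
    void* mem = mmap(nullptr, sizeof(int), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    int* shared = static_cast<int*>(mem);
    *shared = 0;

    const pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, sizeof(int));
        return -1;
    }
    if (pid == 0) {            // child: this write lands in the shared page
        *shared = value;
        _exit(0);
    }

    waitpid(pid, nullptr, 0);  // child has exited, so its write is complete
    const int seen = *shared;
    munmap(mem, sizeof(int));
    return seen;
}
```

Here waitpid() doubles as the synchronisation. Two processes that run concurrently on the same mapping would still need something like a process-shared mutex or atomics, which is the earlier point that the hazards change form rather than disappear.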
