
Embedded Linux: share data among different processes

Started by pozz June 23, 2016
pozz wrote:
> I'm new to embedded Linux so this question could be very simple for many
> of you. Most probably, it isn't directly related to the embedded world, but
> to Linux OS generally. Anyway I think it is a common scenario in
> embedded applications.
>
> I'm going to develop a local/remote control of an electronic device. It
> communicates through an RS485 link.
> The local control will be a touch-screen display (I'm going to use the Qt
> graphic libraries).
> The remote control will be HTTP (web server).
>
> I think a good approach will be to develop a simple application, the
> poller, that communicates with the electronic device and implements the
> RS485 protocol. The poller continuously acquires the current
> status/settings of the device and stores them in some "shared" way.
>
> The graphic application (Qt-based) and the web server (CGI) should
> access the data retrieved by the poller.
>
> What is the best method to share the data generated by one application
> (the poller) among two or more applications (Qt and CGI)?
> In this scenario, I think it's important to lock the "shared data"
> before accessing them (reading or writing), in order to avoid reading
> incoherent data. Indeed, if the poller writes the data at the same time
> (Linux is multi-tasking) as the web server reads them, they could be
> incoherent.
>
> I'm thinking of using an SQLite database to store the data. The poller writes
> the database, and the HTTP and Qt sides read from it. It seems SQLite will
> take care of the multi-thread/multi-process scenario.
>
> Any suggestions?
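A minimal sketch of the SQLite route mentioned in the question, on the poller (writer) side. The table name "status", the key/value schema, and the database path are assumptions for illustration only; build with -lsqlite3:

    /* Sketch only: the poller stores the latest values in SQLite; the Qt
     * GUI and the CGI programs open the same file and read them back.
     * Table name, schema and path are illustrative assumptions. */
    #include <sqlite3.h>

    static int store_status(sqlite3 *db, const char *key, const char *value)
    {
        sqlite3_stmt *st;
        if (sqlite3_prepare_v2(db,
                "INSERT OR REPLACE INTO status(key, value) VALUES(?1, ?2);",
                -1, &st, NULL) != SQLITE_OK)
            return -1;
        sqlite3_bind_text(st, 1, key, -1, SQLITE_STATIC);
        sqlite3_bind_text(st, 2, value, -1, SQLITE_STATIC);
        int rc = sqlite3_step(st);              /* SQLITE_DONE on success */
        sqlite3_finalize(st);
        return rc == SQLITE_DONE ? 0 : -1;
    }

    int main(void)
    {
        sqlite3 *db;
        if (sqlite3_open("/var/run/device_status.db", &db) != SQLITE_OK)
            return 1;
        sqlite3_exec(db, "PRAGMA journal_mode=WAL;", NULL, NULL, NULL);
        sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS status("
                         "key TEXT PRIMARY KEY, value TEXT);",
                     NULL, NULL, NULL);

        /* in the real poller this would run once per RS485 poll cycle */
        store_status(db, "temperature", "23.5");

        sqlite3_close(db);
        return 0;
    }

The Qt and CGI readers would open the same file and SELECT from it; SQLite serializes the concurrent access itself (WAL mode makes one writer plus several readers painless), so the readers only have to retry on SQLITE_BUSY.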
I also stumbled onto a Go implementation that looks interesting. This is for an old scale with an RS232 port a guy found in a junk pile.

https://gist.github.com/sielickin/8cc79f0cb6a4b4c229b9786dffcabdbe

Go has a whole lot of web furniture built in. No clue how you'd add the Qt program to that, however (I'd guess a socket server). Dunno if Go can easily be embedded or not; but with RasPi and BB Black these days....

-- Les Cargill
Paul Rubin wrote:
> pozz <pozzugno@gmail.com> writes:
>> The status data that changes frequently will be maximum 100-200 bytes
>> (I think it will be less than 100 bytes).
>
> If "frequently" means less than a few times per second and you have
> enough hardware resources, you'll probably have an easier time writing
> this as if it were a desktop or web app than using a traditional
> embedded approach. That can even include writing the host part in a
> server-side scripting language instead of something like C.
Yeah - I put up a link to a Go solution that looks great. So many languages, so little time... -- Les Cargill
On 6/24/2016 2:39 PM, Les Cargill wrote:
> I also stumbled onto a Go implementation that looks interesting.
> This is for an old scale with an RS232 port a guy found in a junk pile.
>
> https://gist.github.com/sielickin/8cc79f0cb6a4b4c229b9786dffcabdbe
>
> Go has a whole lot of web furniture built in. No clue how you'd add the Qt
> program to that, however (I'd guess a socket server).
>
> Dunno if Go can easily be embedded or not; but with RasPi and
> BB Black these days....
Inferno can easily do this (esp. for applications that aren't very "demanding"). There's a nominal HTTPd included in the (free) distribution.

Inferno (through its Limbo language) also supports communication channels as first-class objects. E.g., the OP's "poller" can just pass specific typed objects down a channel (like a pipe -- except the endpoints can be on different CPUs, etc.) connecting it to the "user interface/web server" task. For endpoints on the same host (i.e., you don't have to worry about handling "remote host not available" conditions), the code is trivial.

One delightful advantage of the distributed capabilities of Inferno is that you can run hosted Inferno on a PC and interact with the application "naturally" without any artificial IDEs. I.e., you could simulate the poller by writing a Limbo app that accepts keyboard input and passes it to the "user interface" task -- running on your target hardware -- without making any significant changes to the system as a whole.

If you can pick hardware for which a *native* Inferno port is available (e.g., rpi), then your total footprint can be under ~1MB for the OS *and* your application. Limbo apps are amazingly small; and, it's close enough to C that it isn't terribly intimidating, though lack of support for pointers can be distressing; and, getting used to tuples takes a few *seconds* (at which point, you'll wonder how you ever lived without them!)
Don Y wrote:

> On 6/23/2016 9:39 PM, Reinhardt Behm wrote:
>>> This application is not demanding enough to need a shared memory
>>> solution, with the attendant need for locks and especially,
>>> memory barriers, of which most respondents seem ill-educated.
>>> In particular, Mel Wilson's "one writer, many reader" is no
>>> longer so simple with modern multi-core architectures. Writes
>>> do not have to happen in the order they are written, so readers
>>> can get confused.
>>
>> Because of that I prefer to avoid all this locking if it is not needed.
>
> You're just hiding it by wrapping it in an IPC mechanism.
Eh? What am I hiding?

I proposed that the server (poller) collects the data from the device and sends _copies_ of it to the clients. This can happen in a single thread, so there is no need for any kind of locking. So there is nothing hidden behind an IPC mechanism.

My proposed solution internally uses Unix domain sockets and could just as well use TCP sockets, which would give the flexibility to work across different machines. It is proven to work in a DO-178 certified system where several of these mechanisms are at work to convey data from GNSS, TCAS, AHARS, camera control, and Iridium satcomm, each with its own server. And it works in both directions.
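In outline, a single-threaded poller of that kind could look like the sketch below. The socket path, the status struct and the one-second period are illustrative assumptions, not taken from Reinhardt's system:

    /* Sketch: one thread polls the device and pushes a copy of the
     * current status to every connected client.  No shared state, so
     * no locks; clients that disappear are simply dropped. */
    #include <string.h>
    #include <unistd.h>
    #include <poll.h>
    #include <sys/socket.h>
    #include <sys/un.h>

    #define MAX_CLIENTS 8
    #define SOCK_PATH   "/tmp/poller.sock"          /* assumed path */

    struct status { double temperature; int relay_on; };   /* ~100 bytes in real life */

    int main(void)
    {
        int srv = socket(AF_UNIX, SOCK_STREAM, 0);
        struct sockaddr_un addr = { .sun_family = AF_UNIX };
        strncpy(addr.sun_path, SOCK_PATH, sizeof addr.sun_path - 1);
        unlink(SOCK_PATH);
        bind(srv, (struct sockaddr *)&addr, sizeof addr);
        listen(srv, MAX_CLIENTS);

        int clients[MAX_CLIENTS], nclients = 0;
        struct status st = { 0 };

        for (;;) {
            /* 1. talk RS485 to the device and refresh 'st' (omitted) */

            /* 2. accept any newly interested clients without blocking */
            struct pollfd pfd = { .fd = srv, .events = POLLIN };
            while (nclients < MAX_CLIENTS && poll(&pfd, 1, 0) > 0)
                clients[nclients++] = accept(srv, NULL, NULL);

            /* 3. send a fresh copy of the status to every client */
            for (int i = 0; i < nclients; i++) {
                if (send(clients[i], &st, sizeof st, MSG_NOSIGNAL) < 0) {
                    close(clients[i]);               /* client went away */
                    clients[i] = clients[--nclients];
                    i--;
                }
            }

            sleep(1);      /* the poll period (about one second here) */
        }
    }

Each client gets its own private copy on its own descriptor, which is exactly what keeps the clients ignorant of how the server holds the data internally.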
>> Having the data at one place controlled by one thread and sending it from
>> there to interested clients via some IPC mechanism makes life much easier.
>> And it is also a very effective encapsulation. The others cannot even
>> know how it is handled internally in the server.
>
> Yes, but it is also considerably more expensive. Every consumer has to be
> informed of every update to the data in which it has "expressed an
> interest" (assuming you don't broadcast ALL data to ALL consumers).
For many application scenarios the throughput and overhead are of no concern. And yes, I broadcast all data to all clients.
> E.g., I install triggers in my RDBMS so that any update to a particular
> (set of) parameter(s) results in an up-call notification of the interested
> parties that might want to *see* that new value (so, I don't have to
> broadcast all updates of all data to all consumers).
I was never proposing to use a database system. -- Reinhardt
On 6/25/2016 6:34 PM, Reinhardt Behm wrote:
> Don Y wrote:
>
>> On 6/23/2016 9:39 PM, Reinhardt Behm wrote:
>>>> This application is not demanding enough to need a shared memory
>>>> solution, with the attendant need for locks and especially,
>>>> memory barriers, of which most respondents seem ill-educated.
>>>> In particular, Mel Wilson's "one writer, many reader" is no
>>>> longer so simple with modern multi-core architectures. Writes
>>>> do not have to happen in the order they are written, so readers
>>>> can get confused.
>>>
>>> Because of that I prefer to avoid all this locking if it is not needed.
>>
>> You're just hiding it by wrapping it in an IPC mechanism.
>
> Eh? What am I hiding?
The fact that only a single task can access the "original data". You've got a "lock" in the IPC. Your *one* task can't compete with itself WHILE it is busy in the IPC. And, none of the consumers can see anything until the IPC completes (the receive() is atomic).
> I proposed that the server (poller) collects the data from the device and
> sends _copies_ of it to the clients. This can happen in a single thread. So
> there is no need for any kind of locking. So there is nothing hidden behind
> an IPC mechanism.
See above. You could add multiple "server threads" and still hide the contention from your clients -- assuming no two server threads tried to update the same client with the same/different data.
> My proposed solution internally uses Unix domain sockets and could as well
> use TCP sockets which would give the flexibility to work over different
> machines. It is proven to work in a DO-178 certified system where several of
> these mechanisms are at work to convey data from GNSS, TCAS, AHARS, camera
> control, Iridium satcomm, each with its own server. And it works in both
> directions.
I'm not saying it *won't* work. I'm just saying that you've implemented the locking functionality in a different manner. If the "poller" provided data at a faster rate than you could "broadcast" it, you'd lose data or have to introduce another server thread -- and then deal with contention on the server side of the IPC fence.

Don't get me wrong; my entire OS is based on message-passing. Every function call is a message disguised as a function. So, making a kernel request uses the exact same mechanisms as sending data to another local process -- or, a *remote* process.

But, there are (significant) costs associated with this. Crossing a protection boundary means kernel involvement. And, you're not just passing "payload" but, also, the overhead of the IPC/RPC "function", marshalling arguments, etc.

[In my case, it's a net win because the target of an IPC can be virtually and physically relocated. So, a call that is an IPC *now* may be an RPC a few moments from now! The abstraction that passing messages (vs. sharing memory) provides is worth its cost. (I can also share memory across nodes but that leads to disappointment when you think in terms of LOCALLY shared memory -- only to discover that the memory has now migrated to another, remote node. And, accesses are SLOWER than passing messages would be!)]
>>> Having the data at one place controlled by one thread and sending it from
>>> there to interested clients via some IPC mechanism makes life much easier.
>>> And it is also a very effective encapsulation. The others cannot even
>>> know how it is handled internally in the server.
>>
>> Yes, but it is also considerably more expensive. Every consumer has to be
>> informed of every update to the data in which it has "expressed an
>> interest" (assuming you don't broadcast ALL data to ALL consumers).
>
> For many application scenarios the throughput and overhead are of no concern.
Of course! I'm moving to multicore nodes simply with the idea of dedicating a core to RPC. Silicon is cheap, nowadays.
> And yes, I broadcast all data to all clients.
Using broadcast protocols? Or, iterating over them?
>> E.g., I install triggers in my RDBMS so that any update to a particular
>> (set of) parameter(s) results in an up-call notification of the interested
>> parties that might want to *see* that new value (so, I don't have to
>> broadcast all updates of all data to all consumers).
>
> I was never proposing to use a database system.
I was indicating how I *selectively* distribute data "to folks who have previously REGISTERED an interest". There are thousands of processes active in my system at any given time. Virtually *all* of them are unconcerned with any *particular* "data update", yet may have keen interest in SOME update at some time! So, there is a huge potential for waste if I unconditionally broadcast ALL updates to everyone.
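Don's triggers live in his own RDBMS, so they are not something the OP can lift directly. If the SQLite route is taken, though, a rough in-process analogue is sqlite3_update_hook(), which gives the *writing* process (the poller, the single writer here) a callback per changed row. The notify_subscribers() routine below is hypothetical -- it stands in for "walk the list of clients that registered an interest in this row and poke only them":

    /* Sketch: per-row change notification in the writer process.
     * notify_subscribers() is hypothetical -- in a real system it would
     * poke only the clients that registered an interest in this row. */
    #include <stdio.h>
    #include <sqlite3.h>

    static void notify_subscribers(const char *table, sqlite3_int64 rowid)
    {
        printf("row %lld of %s changed -> notify interested clients only\n",
               (long long)rowid, table);
    }

    /* Called by SQLite after every INSERT/UPDATE/DELETE made through 'db'. */
    static void on_change(void *arg, int op, const char *dbname,
                          const char *table, sqlite3_int64 rowid)
    {
        (void)arg; (void)dbname;
        if (op == SQLITE_INSERT || op == SQLITE_UPDATE)
            notify_subscribers(table, rowid);
    }

    int main(void)
    {
        sqlite3 *db;
        sqlite3_open("/var/run/device_status.db", &db);
        sqlite3_update_hook(db, on_change, NULL);

        /* any write made through this connection now triggers on_change() */
        sqlite3_exec(db,
            "CREATE TABLE IF NOT EXISTS status(key TEXT PRIMARY KEY, value TEXT);"
            "INSERT OR REPLACE INTO status VALUES('temperature','23.5');",
            NULL, NULL, NULL);

        sqlite3_close(db);
        return 0;
    }

Note that the hook only fires for changes made through that same connection, so it belongs in the writer; changes made by other processes are not reported this way.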
On Fri, 24 Jun 2016 13:36:36 +0200, pozz wrote:

> On 23/06/2016 15:47, Mel Wilson wrote:
>> On Thu, 23 Jun 2016 13:52:21 +0200, pozz wrote:
>>
[ ... ]
>>> I think a good approach will be to develop a simple application, the
>>> poller, that communicates with the electronic device and implements
>>> the RS485 protocol. The poller continuously acquires the current
>>> status/settings of the device and stores them in some "shared" way.
>>>
>>> The graphic application (Qt-based) and the web server (CGI) should
>>> access the data retrieved by the poller.
>>>
>>> What is the best method to share the data generated by one application
>>> (the poller) among two or more applications (Qt and CGI)?
>>> In this scenario, I think it's important to lock the "shared data"
>>> before accessing them (reading or writing), in order to avoid reading
>>> incoherent data. Indeed, if the poller writes the data at the same time
>>> (Linux is multi-tasking) as the web server reads them, they could be
>>> incoherent.
>>>
>>> I'm thinking of using an SQLite database to store the data. The poller
>>> writes the database, and the HTTP and Qt sides read from it. It seems
>>> SQLite will take care of the multi-thread/multi-process scenario.
>>>
>>> Any suggestions?
>>
>> I did that with System V IPC, using message passing for control
>
> Yes, another problem to solve is the other way: Qt and the HTTP server could
> send some control commands to the electronic device (switch on, switch
> off, change this...).
>
>> and shared memory blocks for info common to all the processes. I was
>> able to avoid any locking using the one-writer/many-readers trick.
>
> It is exactly what I will have: one writer (the poller that will
> continuously fetch updated data from the device) and some readers (at
> least, the HTTP server and the Qt graphic libraries).
> Isn't a synchronization mechanism (semaphore) needed in this
> scenario, using shared memory to share common data?
Just in case I should mention -- my system didn't contain just one Big Writer. Rather, different processes were responsible for different sections of the application, and each of those processes was the sole writer for the shared data concerned with its section.

In this system, the Linux box did the "cerebral" operations, and it communicated over a serial link with a "hind-brain" or "spinal column" run by a PIC32. There was a polling process that communicated with the PIC32 and dispatched input via IPC messages to the responsible processes. Some processes did quasi-real-time operations and, along with the polling process, ran at negative niceness for fastest response. Others could be given a twentieth of a second or so to respond and ran as normal processes. But each task was the sole writer of the shared data that it "owned" and ordered any other changes through IPC messages.

There were a bunch of Apache/CGI processes concerned with displaying diagnostic info remotely, and they never wrote shared data. Some other CGI that implemented remote testing and control did their thing by sending IPC messages to the responsible processes.

Mel.
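For completeness, here is roughly what the one-writer/many-readers arrangement looks like with System V shared memory, with a seqlock-style sequence counter added to cover the multi-core reordering concern raised earlier in the thread. The key value, payload size and field names are assumptions:

    /* Sketch: one writer (the poller), many readers, over SysV shared
     * memory.  The counter is even while the data is stable and odd
     * while the writer is mid-update; a reader copies the block and
     * retries if the counter moved underneath it. */
    #include <string.h>
    #include <stdatomic.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    struct shared_status {
        atomic_uint seq;          /* even = stable, odd = write in progress */
        char payload[200];        /* the ~100-200 byte status block */
    };

    struct shared_status *attach(void)     /* called by writer and readers */
    {
        int id = shmget((key_t)0x5005, sizeof(struct shared_status),
                        IPC_CREAT | 0666);
        return (struct shared_status *)shmat(id, NULL, 0);
    }

    /* Writer side (the poller). */
    void publish(struct shared_status *s, const void *data, size_t len)
    {
        atomic_fetch_add(&s->seq, 1);       /* now odd: readers will retry  */
        memcpy(s->payload, data, len);
        atomic_fetch_add(&s->seq, 1);       /* even again: data is coherent */
    }

    /* Reader side (Qt GUI, CGI).  Copies out a coherent snapshot. */
    void snapshot(struct shared_status *s, void *out)
    {
        unsigned before, after;
        do {
            before = atomic_load(&s->seq);
            memcpy(out, s->payload, sizeof s->payload);
            atomic_thread_fence(memory_order_acquire);  /* copy finishes before re-check */
            after = atomic_load(&s->seq);
        } while ((before & 1) || before != after);
    }

    int main(void)
    {
        struct shared_status *s = attach();
        char now[200] = "T=23.5", copy[200];
        publish(s, now, sizeof now);    /* poller side */
        snapshot(s, copy);              /* reader side (another process normally) */
        return copy[0] != 'T';
    }

A snapshot taken while the writer is mid-update is simply thrown away and retried, which is the usual seqlock trade-off; the simpler-to-reason-about alternative is a plain semaphore or mutex in the shared segment, at the cost of the locking the earlier posts were trying to avoid.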
On 24/06/2016 19:48, Don Y wrote:
> On 6/24/2016 4:43 AM, pozz wrote:
>> On 23/06/2016 21:31, Paul Rubin wrote:
>>> pozz <pozzugno@gmail.com> writes:
>>>> I'm thinking to use SQLite database to store the data. The poller
>>>> writes the database, HTTP and QT reads from it. It seems SQLite will
>>>> take care the multi-thread/multi-process scenario.
>>>
>>> This is the classic and probably easiest approach, though not the most
>>> economical in terms of machine use. There are other databases besides
>>> sqlite that you can also consider. How much data are you talking about?
>>
>> The overall configuration of the device could be maximum 1kB (I think it will
>> be around 100 bytes). However it doesn't change frequently, so it can be
>> retrieved only at startup and only when it really changes.
>>
>> The status data that changes frequently will be maximum 100-200 bytes
>> (I think it will be less than 100 bytes).
>
> Then, why don't you just design an event-driven interface and move the data
> across the protection barrier when these "update events" occur? Or, better,
> when they occur AND are "of interest"?
>
> The "client" can keep a set of "most recently endorsed parameters"
> and an "empty" buffer into which any *new* parameters that are in transit
> from the "server" are accumulated. Once they have "arrived", call
> *that* buffer the authoritative reference and treat the previous
> reference as a "disposable buffer" for the next update.
Of course, this is another approach, the opposite way. The "poller"/"server" pushes new data to the "clients" when they are available, about every 1 second.

However I don't know if this is the best approach. In this case, the "server" should know exactly how many "clients" there are, open a different "communication channel" (a pipe) with each of them, and continuously (even if at a low frequency) transmit data on that channel.

Today the clients are two (the Qt local graphical interface and the remote HTTP interface), but tomorrow? If I need to add a new "client" (local or remote) I would have to touch the "poller"/"server" process.

I think the opposite approach is better. The "poller"/"server" has only one goal: retrieve updated parameters and store them in a share-friendly way for whatever "client" needs them. I only need to choose the best sharing method between "poller"/"server" and "clients".

Usually "clients" will access the shared data for reading, but occasionally they will need to send new parameters (through the graphical touch-screen display or the remote web pages interface) to the device. This is another problem to solve...
>>> Do you need stuff like persistence across reboots?
>>
>> No.
On 24/06/2016 22:52, Paul Rubin wrote:
> pozz <pozzugno@gmail.com> writes:
>> The status data that changes frequently will be maximum 100-200 bytes
>> (I think it will be less than 100 bytes).
>
> If "frequently" means less than a few times per second and you have
> enough hardware resources, you'll probably have an easier time writing
> this as if it were a desktop or web app than using a traditional
> embedded approach.
I'm not an expert on desktop or web apps... some starting points?
> That can even include writing the host part in a
> server-side scripting language instead of something like C.
On 6/27/2016 5:46 AM, pozz wrote:
> On 24/06/2016 19:48, Don Y wrote:
>> On 6/24/2016 4:43 AM, pozz wrote:
>>> On 23/06/2016 21:31, Paul Rubin wrote:
>>>> pozz <pozzugno@gmail.com> writes:
>>>>> I'm thinking to use SQLite database to store the data. The poller
>>>>> writes the database, HTTP and QT reads from it. It seems SQLite will
>>>>> take care the multi-thread/multi-process scenario.
>>>>
>>>> This is the classic and probably easiest approach, though not the most
>>>> economical in terms of machine use. There are other databases besides
>>>> sqlite that you can also consider. How much data are you talking about?
>>>
>>> The overall configuration of the device could be maximum 1kB (I think it will
>>> be around 100 bytes). However it doesn't change frequently, so it can be
>>> retrieved only at startup and only when it really changes.
>>>
>>> The status data that changes frequently will be maximum 100-200 bytes
>>> (I think it will be less than 100 bytes).
>>
>> Then, why don't you just design an event-driven interface and move the data
>> across the protection barrier when these "update events" occur? Or, better,
>> when they occur AND are "of interest"?
>>
>> The "client" can keep a set of "most recently endorsed parameters"
>> and an "empty" buffer into which any *new* parameters that are in transit
>> from the "server" are accumulated. Once they have "arrived", call
>> *that* buffer the authoritative reference and treat the previous
>> reference as a "disposable buffer" for the next update.
>
> Of course, this is another approach, the opposite way.
> The "poller"/"server" pushes new data to the "clients" when they are
> available, about every 1 second.
That's what I was saying, above. The client (i.e., a thread acting for the client) opens a socket, blocks on a FIFO, etc. waiting for an "update" to appear on his doorstep.

Meanwhile, the "real" client (thread) behaves normally, using the LAST set of data (the COMPLETE SET) that it had received. When a new set arrives, it arrives in a coherent *block* -- an entire set of data (not individual parameters). WHEN IT IS CONVENIENT FOR THE CLIENT -- and, whomever the client serves (i.e., the user) -- the client swaps the new set of data to replace the old set. In an indivisible/atomic act.

This allows you some better control over how you implement that "sharing" on the client side. E.g., you can "pause" the web server, grab the harvested data, update the dataset that the web server consults, then resume the web server. You just have to make sure you don't allow the "harvesting" thread to update that dataset (with yet another!) while you are using it to update the web server's copy!

[It is conceivable that the "harvester" thread may already be waiting for yet another "update" -- as there's no guarantee that the "client" will use this most recent data set before another set becomes available!]

Consider the (Qt) user's viewpoint of the data set. If he is engaged in an HTTP "session" with your device, he may (or may not) want to see the most recent data exposed to him "asynchronously". E.g., he may be changing the previous set AS DISPLAYED TO HIM and arranging for all of the parameters to be consistent with each other; he wouldn't want you to alter some PORTION of those -- even if that update (that he hasn't yet seen) is "more current".

The "opposite way" is for clients to *pull* data across; to effectively say "give me X" (and X might be "the entire data set"). A *true* "server" implementation could be sitting with open socket(s) waiting for these IPC requests and *immediately* responding with the most recent (or "most consistent"?) version of the sought data -- which could be "all". This allows the client to *pull* the data AT A TIME WHEN IT IS CONVENIENT FOR IT! But, it pushes the "sharing" issue into the server.

[If the server (poller) isn't ready and waiting, then you risk the case where the server hangs waiting for NEW incoming "remote" data and the web server (the poller's client) also ends up waiting!]

Note that you have to be careful in either implementation. E.g., something like this set of threads:

    Producer()
        ...
        acquireMutex(NAME)
        ...
        write(FIFO, ...)
        ...
        releaseMutex(NAME)
        ...

    Consumer()
        ...
        acquireMutex(NAME)
        ...
        read(FIFO)
        ...
        releaseMutex(NAME)
        ...

    Other()
        ...
        acquireMutex(NAME)
        ...
        releaseMutex(NAME)
        ...

can get into deadlock -- regardless of the cleverness of the scheduling algorithm! (the FIFO operations being any sort of IPC; the mutex a sort of semaphore)
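On the client side, the "swap when convenient" idea reduces to something like the sketch below (pthreads assumed; the struct and the function names are invented for illustration). The harvester thread stores each complete update into a pending slot; the Qt loop or web server adopts it in one short critical section only when *it* decides the moment is right, and works from its last complete copy in between:

    /* Sketch: harvester thread receives complete updates; the consumer
     * (Qt loop, web server) adopts them only when convenient, always
     * working from a complete, coherent copy in between. */
    #include <stdio.h>
    #include <stdbool.h>
    #include <pthread.h>

    struct status { double temperature; int relay_on; };   /* illustrative */

    static struct status pending;          /* last set received from the poller */
    static bool have_new = false;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    /* Called by the harvester thread each time a full update arrives
     * (e.g. one read() of the whole struct from the poller's socket). */
    void harvester_store(const struct status *fresh)
    {
        pthread_mutex_lock(&lock);
        pending = *fresh;                  /* whole set replaced at once */
        have_new = true;
        pthread_mutex_unlock(&lock);
    }

    /* Called by the client whenever *it* decides it wants newer data.
     * Returns true if 'working' was refreshed. */
    bool client_adopt(struct status *working)
    {
        bool updated = false;
        pthread_mutex_lock(&lock);
        if (have_new) {
            *working = pending;            /* atomic from the client's viewpoint */
            have_new = false;
            updated = true;
        }
        pthread_mutex_unlock(&lock);
        return updated;
    }

    int main(void)                         /* trivial single-thread demo */
    {
        struct status working = { 0 }, fresh = { 23.5, 1 };
        harvester_store(&fresh);           /* normally from the harvester thread */
        if (client_adopt(&working))        /* normally from the Qt/web thread    */
            printf("temperature now %.1f\n", working.temperature);
        return 0;
    }

The critical section only covers the copy, so the harvester can never stall the client for longer than one struct assignment.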
> However I don't know if this is the best approach. In this case, the "server"
> should know exactly how many "clients" there are, open a different
> "communication channel" (a pipe) with each of them, and continuously
> (even if at a low frequency) transmit data on that channel.
The easier way, IMO, is to let clients "express an interest" -- instead of forcing the producer to know who all of the clients are at any given moment in time.

Shared memory implicitly does this (only clients INTERESTED in the data will poke around in the shared memory! The producer doesn't care how many of them there might be). Likewise, clients opening connections (IPC's) to the "poller" explicitly tell it who its consumers are. A client uninterested in its data will simply NOT open such a connection.
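With the connection-per-interested-client model, "expressing an interest" is nothing more than connecting. The consumer below assumes the same hypothetical socket path and status struct as the poller sketch earlier in the thread:

    /* Sketch: a consumer "expresses an interest" simply by connecting to
     * the poller's socket; it then blocks until each new copy arrives.
     * Socket path and struct layout are the same assumptions as before. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <sys/un.h>

    struct status { double temperature; int relay_on; };

    int main(void)
    {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        struct sockaddr_un addr = { .sun_family = AF_UNIX };
        strncpy(addr.sun_path, "/tmp/poller.sock", sizeof addr.sun_path - 1);
        if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0)
            return 1;                  /* poller not running */

        struct status st;
        while (read(fd, &st, sizeof st) == (ssize_t)sizeof st)
            printf("temperature now %.1f\n", st.temperature);

        return 0;
    }

A client that doesn't care about the data simply never runs this; the poller never needs a list of who the clients are.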
> Today the clients are two (the Qt local graphical interface and the remote HTTP
> interface), but tomorrow? If I need to add a new "client" (local or
> remote) I would have to touch the "poller"/"server" process.
IMO, you should *not* have to touch the poller! Its job hasn't changed!
> I think the opposite approach is better. The "poller"/"server" has only one
> goal: retrieve updated parameters and store them in a share-friendly way for
> whatever "client" needs them.
> I only need to choose the best sharing method between "poller"/"server" and
> "clients".
You still have a sharing/locking problem in the client -- or in the server. There are different versions of the data "live" in your system at any given time:

- the data that the Qt GUI is using
- the data that the HTTPd interface is exporting
- the data that the poller is CURRENTLY acquiring (from the CAN? nodes)
- the most recent *set* of data that the poller had ALREADY acquired
- etc.

So, you have lots of opportunities for the system to be in an inconsistent state.

E.g., imagine (for want of a better example, as I have no idea of the actual nature of the data that you are processing) that one of your parameters is "today's date" (fetched from a remote node via your CAN? interface).

The HTTPd client (web server) may have one notion of today's date -- because it has deliberately NOT updated this in case the web user might be in the process of updating the current *time* and would be disturbed to find the date having changed while he was doing that.

The Qt GUI user can be in a similar situation -- maybe the Qt session began before the HTTPd session and the date had a different value at that time! It was changed (by something) AFTER the GUI session started, so the GUI is technically showing stale data. But, the HTTPd session, started after that date-change, is showing a "more recent" version (which might *also* be stale!)

The poller has the "last date" successfully retrieved (from that remote "clock" node). But, it happens to be retrieving an updated date at this very moment!

Plus, any "dates" stuck in pipes/IPC's awaiting delivery... Plus, any changes in the date that are happening IN the "remote node" (time marches on -- 23:59:00 becomes "tomorrow" in short order!)
> Usually "clients" will access the shared data for reading, but occasionally they
> will need to send new parameters (through the graphical touch-screen display or
> the remote web pages interface) to the device.
> This is another problem to solve...
Another way of addressing this is with a token passing implementation.

The poller (?) builds a packet of data. Then, it sends it to client #1. Client #1 extracts any values of interest and, possibly, modifies any parameters that it considers as important. Then, it forwards the packet to client #2 -- who does the same sort of things. Eventually, the entire "circuit" is traversed.

A second trip is made around the circuit in which each "participant" compares the newly received packet to a saved copy of the packet that *it* passed along previously. It extracts any parameters that differ to ensure its "copy" reflects this new packet that it will be propagating. When the second trip is complete, everyone is known to have the same data and they can each use it.

[This is tricky to do and can be very brittle and nonperformant]
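Mechanically, the two laps look like the little single-process simulation below. The packet layout, the "each node owns one field" rule and the three participants are invented just to make the round trip visible; a real implementation would move the packet between processes over pipes or sockets:

    /* Sketch: simulate the two-lap token-passing circuit in one process.
     * Three participants each "own" one field of the packet. */
    #include <stdio.h>

    #define NFIELDS 3
    struct packet { int field[NFIELDS]; };

    struct participant {
        int owns;                /* index of the field this node may modify   */
        int newvalue;            /* the value it wants to publish on lap one  */
        struct packet sent;      /* copy of the packet it forwarded on lap one */
        struct packet view;      /* its local idea of the shared data         */
    };

    int main(void)
    {
        struct participant p[3] = {
            { .owns = 0, .newvalue = 10 },
            { .owns = 1, .newvalue = 20 },
            { .owns = 2, .newvalue = 30 },
        };
        struct packet pkt = { { 0 } };

        /* Lap one: each node folds in its own change, remembers what it
         * forwarded, and passes the packet along. */
        for (int i = 0; i < 3; i++) {
            pkt.field[p[i].owns] = p[i].newvalue;
            p[i].sent = pkt;
            p[i].view = pkt;
        }

        /* Lap two: each node compares the arriving packet with the copy it
         * forwarded and folds the differing fields into its own view. */
        for (int i = 0; i < 3; i++)
            for (int f = 0; f < NFIELDS; f++)
                if (pkt.field[f] != p[i].sent.field[f])
                    p[i].view.field[f] = pkt.field[f];

        /* After the second trip everyone holds the same data. */
        for (int i = 0; i < 3; i++)
            printf("node %d sees %d %d %d\n", i,
                   p[i].view.field[0], p[i].view.field[1], p[i].view.field[2]);
        return 0;
    }

As the post says, the hard part is not this happy path but what happens when a participant drops out or the packet is lost mid-circuit.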
pozz <pozzugno@gmail.com> writes:
> I'm not an expert on desktop or web apps... some starting points?
https://wiki.python.org/moin/BeginnersGuide maybe. Basically I mean: write in a scripting language, use high-level libraries or services, and (within reason) don't worry about resource consumption. What hardware are you using?
