
Thread based software architecture vs Process based software architecture

Started by Karthik Balaguru September 21, 2014
Hi,
I have a few queries on the best possible software architecture.

Processes are heavyweight: they appear to occupy more memory, take more time to create/start, add latency during context switches, and their separate memory spaces necessitate heavy IPC mechanisms. Threads are lightweight and share a memory space. However, I realized that threads also enter into contention for resources/memory because of the resources shared among them, which in turn becomes a kind of bottleneck for a multi-threaded architecture but not for a multiple-process architecture. Also, the workaround for having thread-local storage does not seem to be straightforward. This makes me believe that maintaining a multi-threaded application can be a bit more complex than a multiple-process architecture. It also suggests that the performance of a multi-process architecture will be better due to the separate memory spaces (which avoid locking or serialization of execution), and this seems to take away the advantage of the smaller context-switch time of a multi-threaded application!! Kindly let me know if this understanding is correct, or correct it with appropriate inputs.

I understand that the software architecture mainly depends on the type of application/requirement. Considering the development environment as Linux with C on single-core/multi-core processors, I would like to know for which types of applications we should go for a multi-threaded software architecture and for which types we should go for a multiple-process architecture. Is there any matrix sheet that maps/lists the types of requirements/applications and the possible software architecture for each?

Thanks in advance,
Karthik
On Sun, 21 Sep 2014 05:40:26 -0700, Karthik Balaguru wrote:

> Hi,
> I have a few queries on the best possible software architecture.
> [...]
You are mistaken in your notion that, just because threads explicitly share a memory space and other resources, they have more contention. Threads share a memory _space_, but they don't have to use the same bits of memory within that space -- it is easy to set things up so that each thread has its own chunk of memory that it uses.

In a sense, once you get past the MMU, processes share the same memory space, too -- it's just that the MMU protects each process from having to know about the memory space occupied by other processes, or even, for that matter, from having to know what physical addresses it occupies.

The "processes have separate memory space" is an illusion, provided in hardware by the MMU. At the point where activity is going on in physical memory, all the processes have to access the same memory space, and so they contend for that resource. Ditto hard drive accesses, screen access, etc.

Really, the biggest thing that you give up with threads vs. processes is that -- assuming the OS is doing its job -- processes are safe from one another. Threads, however, can easily stomp on one another, simply by writing into some part of memory that some other thread is using and thinks isn't going to be disturbed.

For me, the dividing line between threads and processes is one of work load, processor loading, and trust: do I trust whoever is developing that software entity over there not to stomp on my stuff, and is it less work for both of us, using threads, to not stomp on each other's stuff than it is to just use processes? And can the job be done at all using processes? If the answer to the first two questions is "yes", then threads are indicated. If the answer to _either_ of the first two questions is "no", then processes are indicated -- and if the answer to the third question is then "no", the project is in jeopardy.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
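To make the "own chunk of memory" point concrete, here is a minimal sketch (all identifiers are hypothetical, not from this thread): each pthread receives its own private context block through its start argument, so the threads share one address space yet never touch the same bytes. Contention only appears if you *choose* to hand two threads the same block.

    /* Minimal sketch: each thread gets a private context block via its
     * start argument -- shared address space, disjoint bytes. */
    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4

    struct worker_ctx {                /* one per thread, never shared */
        int  id;
        char buf[64];
    };

    static void *worker(void *arg)
    {
        struct worker_ctx *ctx = arg;  /* only this thread uses ctx */
        snprintf(ctx->buf, sizeof ctx->buf, "thread %d: private data", ctx->id);
        return NULL;
    }

    int main(void)
    {
        pthread_t tid[NTHREADS];
        static struct worker_ctx ctx[NTHREADS];   /* disjoint chunks */

        for (int i = 0; i < NTHREADS; i++) {
            ctx[i].id = i;
            pthread_create(&tid[i], NULL, worker, &ctx[i]);
        }
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(tid[i], NULL);
        for (int i = 0; i < NTHREADS; i++)
            puts(ctx[i].buf);          /* no locking was ever needed */
        return 0;
    }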
Hi Karthik,

On 9/21/2014 5:40 AM, Karthik Balaguru wrote:
> Hi, Have few queries on the best possible software architecture.
None -- without a clear definition of the application domain! :>
> Processes are heavyweight: they appear to occupy more memory, take more
> time to create/start, add latency during context switches, and their
> separate memory spaces necessitate heavy IPC mechanisms. Threads are
> lightweight and share a memory space.
The easiest way to think of the distinction is: threads are active entities (i.e., they are the "things" that "execute code"). Processes are containers that hold resources -- which can include (one or more) *threads*!

I.e., a process is like its own little "machine" -- with its own memory, access privileges, priorities (in the context of the "machine" in which it resides), etc.

So, if the "system"/machine has certain shared resources (I/O devices, etc), it is the *process* that requests (by the actions of one of its threads) those resources and, eventually, gains ownership/access to it. (I.e., thread #1 in process A can request a resource and, when made available, thread #5 in process A can *use* that resource -- but none of the threads in process B can, at that time)

Given that processes contain threads (in this conceptualization), you can see why it is "more expensive" to switch processes than it is to switch threads.

You can also see why two threads in a process can compete to access a resource THAT THE PROCESS OWNS (either because it was explicitly requested from "the system" by "some thread" in that process; OR it was implicitly granted to that process when the process was instantiated: e.g., "shared memory" IN the process's address space). You are glossing over the potential case where two or more PROCESSES have to compete in "the system" for other "shared resources".
> However, I realized that threads also enter into contention for
> resources/memory because of the resources shared among them, which in
> turn becomes a kind of bottleneck for a multi-threaded architecture but
> not for a multiple-process architecture.
It's still a bottleneck. If two or more processes want to share some data, they either do so via "shared memory" (assuming the OS supports this between processes) -- which requires SOME form of "access/contention resolution" -- or by a more expensive solution (e.g., IPC/RPC). In each case, SOMETHING is handling the fact that contention can exist.
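As a minimal sketch of what that looks like on the process side -- assuming POSIX shared memory and semaphores are available (the region name and iteration counts are arbitrary; link with -pthread, and -lrt on older glibc) -- two processes bump a shared counter, with a process-shared semaphore serving as the "access/contention resolution":

    /* Minimal sketch: two processes share a counter through POSIX shared
     * memory; a process-shared semaphore resolves the contention. */
    #define _POSIX_C_SOURCE 200809L
    #include <fcntl.h>
    #include <semaphore.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    struct shared {
        sem_t lock;
        long  counter;
    };

    int main(void)
    {
        /* the name "/demo_shm" is arbitrary for this sketch */
        int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
        ftruncate(fd, sizeof(struct shared));
        struct shared *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);
        sem_init(&s->lock, 1 /* shared between processes */, 1);
        s->counter = 0;

        if (fork() == 0) {                 /* child */
            for (int i = 0; i < 100000; i++) {
                sem_wait(&s->lock);
                s->counter++;
                sem_post(&s->lock);
            }
            _exit(0);
        }
        for (int i = 0; i < 100000; i++) { /* parent */
            sem_wait(&s->lock);
            s->counter++;
            sem_post(&s->lock);
        }
        wait(NULL);
        printf("counter = %ld\n", s->counter);   /* 200000 */
        shm_unlink("/demo_shm");
        return 0;
    }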
> Also, the workaround for having thread-local storage does not seem to
> be straightforward. This makes me believe that maintaining a
> multi-threaded application can be a bit more complex than a
> multiple-process architecture.
There is no concept of a thread's "(private) memory space" -- though you can easily arrange for this (e.g., each thread has its own pushdown stack! anything thread #1 does that is implemented via the stack is effectively private -- though a rogue thread can still scribble on it!). By contrast, each (single-threaded) process has a unique, disjoint memory space "guaranteed" by the OS at the process's instantiation (I am assuming you have a "real/nonTOY OS").
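As an aside on the earlier thread-local-storage worry: on Linux with a C11 compiler (GCC/Clang), per-thread variables are a one-keyword affair. A minimal sketch, with hypothetical names -- each thread gets its own copy of the variable:

    /* Minimal sketch: _Thread_local gives each thread a private copy. */
    #include <pthread.h>
    #include <stdio.h>

    static _Thread_local int per_thread_count;   /* one copy per thread */

    static void *worker(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 5; i++)
            per_thread_count++;          /* touches only this thread's copy */
        printf("my count: %d\n", per_thread_count);
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        pthread_create(&a, NULL, worker, NULL);
        pthread_create(&b, NULL, worker, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("main's copy: %d\n", per_thread_count);   /* still 0 */
        return 0;
    }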
> It also suggests that the performance of a multi-process architecture
> will be better due to the separate memory spaces (which avoid locking
> or serialization of execution), and this seems to take away the
> advantage of the smaller context-switch time of a multi-threaded
> application!!
No. If there is no contention, there is no locking required beyond what is implicitly present when "thread #1" is scheduled to execute while the other threads are (temporarily) blocked.

Contention has costs, period. You can structure your code so that these costs are minimized. E.g., in a consumer/producer model of sharing, the two threads never actually compete for the same "object" -- an object that is being produced is invisible to a consumer waiting to consume it! Likewise, an object that has BEEN produced is no longer of interest to its producer!

Process model gives an (incorrect) illusion of greater separation only because it "makes sharing (between PROCESSES) harder". If you similarly impose the restrictions that different process spaces impose on your code (i.e., never compete for data for which you have no NEED to compete -- as if it was NOT POSSIBLE), then the costs of sharing are the same -- none.
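That producer/consumer discipline can be shown in code. A minimal sketch assuming C11 atomics, one producer and one consumer: the producer writes only `head`, the consumer writes only `tail`, and neither ever touches a slot the other is still using, so no lock is needed.

    /* Minimal SPSC ring-buffer sketch: the two threads never compete for
     * the same slot, so synchronization reduces to two atomic indices. */
    #include <stdatomic.h>
    #include <stdbool.h>

    #define RING_SIZE 16u                  /* power of two */

    struct ring {
        int         slot[RING_SIZE];
        atomic_uint head;                  /* written only by producer */
        atomic_uint tail;                  /* written only by consumer */
    };

    bool ring_put(struct ring *r, int v)   /* producer side */
    {
        unsigned h = atomic_load_explicit(&r->head, memory_order_relaxed);
        unsigned t = atomic_load_explicit(&r->tail, memory_order_acquire);
        if (h - t == RING_SIZE)
            return false;                  /* full */
        r->slot[h % RING_SIZE] = v;        /* invisible to the consumer yet */
        atomic_store_explicit(&r->head, h + 1, memory_order_release);
        return true;
    }

    bool ring_get(struct ring *r, int *v)  /* consumer side */
    {
        unsigned t = atomic_load_explicit(&r->tail, memory_order_relaxed);
        unsigned h = atomic_load_explicit(&r->head, memory_order_acquire);
        if (t == h)
            return false;                  /* empty */
        *v = r->slot[t % RING_SIZE];       /* no longer of interest to producer */
        atomic_store_explicit(&r->tail, t + 1, memory_order_release);
        return true;
    }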
> Kindly let me know if this understanding is correct, or correct it
> with appropriate inputs.
>
> I understand that the software architecture mainly depends on the type
> of application/requirement. Considering the development environment as
> Linux with C on single-core/multi-core processors, I would like to
> know for which types of applications we should go for a multi-threaded
> software architecture and for which types we should go for a
> multiple-process architecture. Is there any matrix sheet that
> maps/lists the types of requirements/applications and the possible
> software architecture for each?
(sigh) *BIG* (complex) question. Essentially, you have to look at the benefits of "tightly coupled" execution (threads) vs. more "loosely coupled" (processes). And, the overhead involved in each sharing case. Likewise, the potential for (the illusion of) concurrency and the periods involved.

E.g., any time an "execution context" (threaded or single-thread) has to block on <something> (resource, user, i/o, etc.), then there is an opportunity for some other execution context to "do meaningful work" (note that this is not the same thing as a GUARANTEE that they will be able to do meaningful work!).

How often this occurs and the amount (percent?) of time that the blocked process is suspended -- relative to the rate at which "new work" arrives -- determines how much time you can afford to "waste" in the overhead of your model (thread v. process).

E.g., if work is represented by cars arriving at a toll booth (your job being to monitor the presence of individual cars, the receipt of appropriate payment from each and the control of the "gate" allowing paid vehicles to pass), you could (all else being equal) create a single-threaded process that:

    wait for car;
    wait for payment;
    raise gate;
    lather, rinse, repeat

And, spawn N instances of this process -- one for each "lane" at the toll booth (binding the appropriate instances of "car sensor", "coin counter", "gate actuator" to each instance). The "procedure" (I am trying to avoid using the word "process") is inherently serial -- easily handled by a single thread.

[A car doesn't arrive at lane 4, pay at gate 7 and then exit at gate 2!]

THE PROCESSES HAVE NOTHING TO SAY TO EACH OTHER! So, there is no contention *between* them.

Most of the time, a process is waiting for (the next thing) to happen. I.e., while waiting for payment, it doesn't have to deal with "another car" -- even though another car *may* be arriving in some other lane! So, the cost of multiple processes (time) is largely hidden in that "wait time".

You can, similarly, design this as a set of THREADS in the exact same way! Each thread has nothing to share with the other threads!

[Keep this in mind as you read each of the following examples. "Thread" can often be replaced by "process"; but, you will have to think of everything else going on in the particular example to evaluate how (in)effective that solution might be!]

Imagine, instead, writing this process as a set of threads: one that waits for the car; another that waits for payment; a third that raises the gate (and, presumably, ensures the car has passed successfully). These threads need to share information -- you don't want the gate_raiser thread to raise the gate before the payment_received thread has vouched for the vehicle's compliance!

That shared information can be as simple as a shared "state" variable: {AWAITING_CAR, AWAITING_PAYMENT, RAISING_GATE}. Each thread can be responsible for monitoring the variable to determine when it is appropriate to "start" AND updating the variable when it has finished its assigned chore. I.e., only one thread is ever "holding" the variable (able to write to it!). (A sketch of this follows below.)

[threads could also directly start/unblock each other in succession... lots of ways to skin this cat]

You could likewise use a set of *processes* to do this: each process (pedantically, the single thread *in* each process) responsible for blocking on a particular condition, etc. But, processes cost more and are heavier-footed than threads.
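Here is a minimal sketch of that shared-"state"-variable version for a single lane (all names hypothetical), using a pthread mutex and condition variable: each thread sleeps until the lane state says it is its turn, performs its chore, then advances the state. The chores are empty stubs in this sketch.

    /* Minimal sketch: one lane, three threads, one shared state variable.
     * Only the thread whose turn it is ever writes the state. */
    #include <pthread.h>

    enum lane_state { AWAITING_CAR, AWAITING_PAYMENT, RAISING_GATE };

    struct lane {
        enum lane_state state;
        pthread_mutex_t mtx;
        pthread_cond_t  changed;
    };

    /* wait until the lane is in `want`, run `chore`, advance to `next` */
    static void run_stage(struct lane *l, enum lane_state want,
                          void (*chore)(void), enum lane_state next)
    {
        pthread_mutex_lock(&l->mtx);
        while (l->state != want)
            pthread_cond_wait(&l->changed, &l->mtx);
        pthread_mutex_unlock(&l->mtx);

        chore();              /* watch sensor / count coins / lift gate */

        pthread_mutex_lock(&l->mtx);
        l->state = next;
        pthread_cond_broadcast(&l->changed);
        pthread_mutex_unlock(&l->mtx);
    }

    static void wait_for_car(void) { /* poll the car sensor ...   */ }
    static void take_payment(void) { /* count the coins ...       */ }
    static void lift_gate(void)    { /* drive the gate actuator ... */ }

    static void *car_thread(void *arg)
    {
        struct lane *l = arg;
        for (;;) run_stage(l, AWAITING_CAR, wait_for_car, AWAITING_PAYMENT);
        return NULL;
    }
    static void *payment_thread(void *arg)
    {
        struct lane *l = arg;
        for (;;) run_stage(l, AWAITING_PAYMENT, take_payment, RAISING_GATE);
        return NULL;
    }
    static void *gate_thread(void *arg)
    {
        struct lane *l = arg;
        for (;;) run_stage(l, RAISING_GATE, lift_gate, AWAITING_CAR);
        return NULL;
    }

    int main(void)
    {
        struct lane l = { AWAITING_CAR,
                          PTHREAD_MUTEX_INITIALIZER,
                          PTHREAD_COND_INITIALIZER };
        pthread_t a, b, c;
        pthread_create(&a, NULL, car_thread, &l);
        pthread_create(&b, NULL, payment_thread, &l);
        pthread_create(&c, NULL, gate_thread, &l);
        pthread_join(a, NULL);   /* threads loop forever in this sketch */
        return 0;
    }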
In the "process" implementation, the sharing has to happen through some OS-supported mechanism -- *if* processes are prohibited with accessing each other's (or *SYSTEM*!) resources. In the thread implementation, threads within a shared "container" can freely exchange information (relying on synchronization primitives provided by the OS *or* by constraints inherent in the algorithm: "YOU set this, I will CLEAR it") You could, also, have one giant "process" with lots of threads -- that handles the entire toll-booth. (again, lots of ways to skin this cat... I'll let you sort out the "more obvious" ones) Threads could sit "awaiting events". A set of "accepting payment" threads (responsible for verifying proper payment) can sit waiting for "CAR_ARRIVED" events (messages). When such an event is detected, the first WAITING/blocked thread consumes it and begins execution (the event obviously has to specify the lane on which the waiting car was detected!). [The next "accepting payment" thread -- IF ANY (possibly a configurable option... you might have fewer threads than lanes, etc. depends on the expected interarrival times of "cars") -- then steps up and awaits the NEXT "CAR_ARRIVED" event. This may be on the same lane as the immediately preceding event -- or, another lane entirely!] The "accepting payment" thread recently activated (above), now sits waiting for "coin received" events (from the specific lane that it is monitoring!). When it has processed enough of these to indicate proper payment, it generates a PAYMENT_RECEIVED event (tagged with the corresponding lane number) and then goes back to waiting for another "CAR_ARRIVED" event. [I.e., this flavor thread can only handle CAR_ARRIVED events!] Similarly, another (set of one or more) "raising gate" threads sit waiting for PAYMENT_RECEIVED events and act accordingly. Here, you need as many "raising gate" threads as there are gates that you want to be able to raise CONCURRENTLY! (e.g., if you don't mind letting other "paid customers" wait while you raise the gate for customer X, then you only need enough of those threads to raise *on* gate at a time! Yet another way of doing this is to have N copies of generic threads that are capable of processing *any* sort of event -- i.e., having a dispatch table (switch statement) at the start to route the event to the appropriate processing code fragment. In this case, you need only enough threads to handle the total number of "things" that can be happening at one time (i.e., one thing on each lane). Ah, but what is to prevent a cock-up in The System (or, an exploit by a savvy user?) from preventing the vehicle's initial arrival to be immediately followed by a PAYMENT_RECEIVED event? I.e., BEFORE the "accepting payment" thread has even been activated! (we have a technical term for this: "bug") In the initial "serial process", this wasn't possible: the code that was executed after payment was received COULDN'T run until a car had been detected AND coins counted. The design of the code precluded that possibility. To "exploit" the system, a user would have to synthesize all of the preceding events to "advance" the algorithm to the point where it was ready to lift the gate. OK, let's build a SHARED OBJECT that indicates the "state" of each of the lanes! That way, the "raising gate" thread won't invoke the actuator unless it sees all of the required prerequisites in place -- even if "signaled" by a PAYMENT_RECEIVED event! 
Now, you have several entities trying to update that state AT THE SAME TIME THAT OTHERS ARE TRYING TO EXAMINE IT. "Contention" that affects the entire application's performance -- ONE bottleneck (instead of a "bottleneck per lane" -- or NO bottlenecks!)

Imagine if the cost of ATOMICALLY accessing this object was a fat system call -- because it resided somewhere that all PROCESSES could access (contrast with THREADS)!

In each case, you decide how much information you are sharing and who you are sharing it with. A single thread that runs a single lane from start to finish IMPLICITLY is sharing data with itself: it saw the car arrive on its assigned lane, it watched as the coins were deposited in the coin acceptor on that lane, then it raised the gate for that lane -- before returning to await the next arrival.

As you split the "chore" into finer pieces -- or, split the handling of it into different/disjoint "execution contexts" -- you need to pass more information between those objects. E.g., passing events of the form (<lane>, <event_type>) to a set of generic "handlers" moves the sharing into the "event system".

[whether this is a fifo, shared memory, IPC, etc.]

OTOH, you increase the possibilities for concurrency and more efficient use of resources (why have N "raise gate" processes if drivers can afford to wait for THEIR gate to be lifted? Perhaps the gate lift mechanism can ONLY lift a single gate at a time -- motor and gears/clutches!).

Sorry for the long-winded explanation. I will promptly be derided for it. But, hopefully it shows you different approaches (that exploit "potential parallelism/decomposition" in different ways) and the potential consequences of different approaches.

You have to look at your workload and see what approach makes the most sense. Interconnections are expensive in any algorithm!
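For completeness, a minimal sketch of the "generic worker" variant mentioned above (the event queue and handler names are hypothetical): N identical threads pull (<lane>, <event_type>) events from one queue and dispatch on the event type.

    /* Minimal sketch of the "generic worker" variant: identical threads
     * pull events off one queue and dispatch via a switch statement. */
    #include <pthread.h>

    enum ev_type { CAR_ARRIVED, COIN_RECEIVED, PAYMENT_RECEIVED };
    struct event { int lane; enum ev_type type; };

    #define QCAP 64                       /* power of two; sketch only */
    static struct event    evq[QCAP];
    static unsigned        q_head, q_tail;
    static pthread_mutex_t q_mtx  = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  q_cond = PTHREAD_COND_INITIALIZER;

    static void queue_put(struct event ev)
    {
        pthread_mutex_lock(&q_mtx);
        evq[q_head++ % QCAP] = ev;        /* sketch: assumes never full */
        pthread_cond_signal(&q_cond);
        pthread_mutex_unlock(&q_mtx);
    }

    static struct event queue_get(void)   /* blocks until work arrives */
    {
        pthread_mutex_lock(&q_mtx);
        while (q_tail == q_head)
            pthread_cond_wait(&q_cond, &q_mtx);
        struct event ev = evq[q_tail++ % QCAP];
        pthread_mutex_unlock(&q_mtx);
        return ev;
    }

    static void handle_car(int lane)     { (void)lane; /* start coin count */ }
    static void handle_coin(int lane)    { (void)lane; /* tally payment   */ }
    static void handle_payment(int lane) { (void)lane; /* raise that gate */ }

    static void *generic_worker(void *arg)
    {
        (void)arg;
        for (;;) {
            struct event ev = queue_get();
            switch (ev.type) {            /* the "dispatch table" */
            case CAR_ARRIVED:      handle_car(ev.lane);     break;
            case COIN_RECEIVED:    handle_coin(ev.lane);    break;
            case PAYMENT_RECEIVED: handle_payment(ev.lane); break;
            }
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t w[3];                   /* one per concurrent "thing" */
        for (int i = 0; i < 3; i++)
            pthread_create(&w[i], NULL, generic_worker, NULL);
        queue_put((struct event){ .lane = 4, .type = CAR_ARRIVED });
        pthread_join(w[0], NULL);         /* workers loop forever here */
        return 0;
    }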
On 9/21/2014 12:23 PM, Tim Wescott wrote:
> On Sun, 21 Sep 2014 05:40:26 -0700, Karthik Balaguru wrote:
>
> You are mistaken in your notion that, just because threads explicitly
> share a memory space and other resources, they have more contention.
> Threads share a memory _space_, but they don't have to use the same
> bits of memory within that space -- it is easy to set things up so
> that each thread has its own chunk of memory that it uses.
>
> In a sense, once you get past the MMU, processes share the same memory
> space, too -- it's just that the MMU protects each process from having
> to know about the memory space occupied by other processes, or even,
> for that matter, from having to know what physical addresses it
> occupies.
MMU is not a requirement of a process model. Nor is it excluded from a thread model (even in a single-process system). There is no guarantee that processes are protected from each other. That is an implementation detail. I.e., you can adopt a "process model" and have everyone living in a single, unified, FLAT memory space.

The better way of thinking of processes is as resource containers (threads being one sort of resource). As such, they are bigger/heavier than threads -- which only have to remember their current processor state.

E.g., a *process* can hold a resource. A thread cannot (the process CONTAINING the thread holds it). So, a process handling your "console" can hold that console (hardware/software construct) and one thread in it can paint the screen while another thread is responsible for "ringing the bell" (which takes a sizable fraction of a second!)
> The "processes have separate memory space" is an illusion, provided in > hardware by the MMU. At the point where activity is going on in physical > memory, all the processes have to access the same memory space, and so > they contend for that resource. Ditto hard drive accesses, screen access, > etc.
In most modern OS's (I won't use the L-word!), an MMU enforces a separation (partitioning) of this "hardware" address space. *If* the processor has such hardware available. (how "violations" are handled is another subject)
> Really, the biggest thing that you give up with threads vs. processes
> is that -- assuming the OS is doing its job -- processes are safe from
> one another. Threads, however, can easily stomp on one another, simply
> by writing into some part of memory that some other thread is using
> and thinks isn't going to be disturbed.
No, that isn't guaranteed -- unless you speak of a specific *port* of a specific OS (think of OS's that claim to run on hardware WITHOUT MMU's! The process model still applies -- you just lose the protections!)
> For me, the dividing line between threads and processes is one of work
> load, processor loading, and trust: do I trust whoever is developing
> that software entity over there not to stomp on my stuff, and is it
> less work for both of us, using threads, to not stomp on each other's
> stuff than it is to just use processes? And can the job be done at all
> using processes? If the answer to the first two questions is "yes",
> then threads are indicated. If the answer to _either_ of the first two
> questions is "no", then processes are indicated -- and if the answer
> to the third question is then "no", the project is in jeopardy.
For me, the criterion (if it has to be boiled down to a single one) is "communications". The more data that has to be "interactively" shared (or, the higher the frequency of sharing), the more of an annoyance process boundaries become. Because there are BOUNDARIES that must be crossed (at some cost) -- even if only in a conceptual sense (i.e., just because you don't have hardware protections doesn't mean scribbling in another process -- CONTAINER -- is "right").
On Sun, 21 Sep 2014 05:40:26 -0700 (PDT), Karthik Balaguru
<karthik.balaguru007@gmail.com> wrote:

>Processes are heavy weight and they appear to occupy more memory, >more time to create/start, increased latency during context >switches and separate memory space that necessitates heavy IPC >mechanisms. Threads are light weight and share memory space.
Yes and no. The best way to think of a process is as a resource container - a thread is a particular kind of resource (a computation resource) that a process can contain. Processes also typically are protection boundaries whereas threads typically are not [though there are exceptions].
>Considering the development environment as Linux OS with C >language on single core/multi-core processors, i would like to >know for which type of applications should we need to go in >for multi-threaded software architecture and for which type >of applications should we need to go in for multiple process >based software architecture ?
In Linux there is a system call "clone" (see clone(2)). Clone essentially creates new threads, but it permits detaching a new thread into a separate process and specifying with relatively fine control what parent resources should be copied to the child. Clone wraps an even lower level call that provides even more control over the environment of the new thread. Using clone you can create very lightweight processes, e.g., just a thread with MMU protection.
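A minimal sketch of clone(2) in use (Linux-specific, via the glibc wrapper; the flag set is chosen purely for illustration): with CLONE_VM the child shares the parent's address space, thread-style; drop CLONE_VM and the child gets its own copy of memory, process-style.

    /* Minimal clone(2) sketch: a thread-like child that shares memory
     * with its parent.  Removing CLONE_VM makes it process-like. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>

    static int shared_counter;

    static int child_fn(void *arg)
    {
        (void)arg;
        shared_counter++;        /* visible to parent only with CLONE_VM */
        return 0;
    }

    int main(void)
    {
        const size_t STACK_SIZE = 64 * 1024;
        char *stack = malloc(STACK_SIZE);   /* child's stack; grows down */

        /* thread-like child: shares memory, filesystem info, files */
        pid_t pid = clone(child_fn, stack + STACK_SIZE,
                          CLONE_VM | CLONE_FS | CLONE_FILES | SIGCHLD, NULL);
        if (pid == -1) { perror("clone"); return 1; }
        waitpid(pid, NULL, 0);

        printf("counter = %d\n", shared_counter);   /* 1 with CLONE_VM */
        free(stack);
        return 0;
    }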
>Karthik
George
Hi Don,

Thanks for your quick reply!
That was indeed a pretty long & interesting explanation!!

Karthik



On Monday, 22 September 2014 02:34:45 UTC+5:30, Don Y  wrote:
> Hi Karthik,
>
> [...]
>
> You have to look at your workload and see what approach makes the
> most sense. Interconnections are expensive in any algorithm!
Hi George,

Thanks for pointing it out. I agree that clone can be really handy in system architectures based on multiple threads running concurrently in a shared memory space, since its flags control the different levels of sharing between the parent and child tasks.

Karthik

On Tuesday, 23 September 2014 00:45:58 UTC+5:30, George Neuner  wrote:
> [...]
>
> In Linux there is a system call "clone" (see clone(2)). Clone
> essentially creates new threads, but it permits detaching a new
> thread into a separate process and specifying with relatively fine
> control what parent resources should be copied to the child.
>
> Clone wraps an even lower level call that provides even more control
> over the environment of the new thread. Using clone you can create
> very lightweight processes, e.g., just a thread with MMU protection.
Hi Karthik,

On 9/26/2014 8:11 AM, Karthik Balaguru wrote:

> Thanks for your quick reply!
> That was indeed a pretty long & interesting explanation!!
The point is to show how a single application can be approached in a variety of different ways. And, within those different approaches, how the "sharing"/contention can manifest -- or not. Finally, the relative differences in costs between process vs. threaded implementations when it comes to that sharing.

Figuring out how to approach YOUR problem (how to decompose it) will be your first step to determining the "most effective" implementation.
On Monday, 22 September 2014 00:53:32 UTC+5:30, Tim Wescott  wrote:
> [...]
>
> For me, the dividing line between threads and processes is one of work
> load, processor loading, and trust: do I trust whoever is developing
> that software entity over there not to stomp on my stuff, and is it
> less work for both of us, using threads, to not stomp on each other's
> stuff than it is to just use processes? And can the job be done at all
> using processes? If the answer to the first two questions is "yes",
> then threads are indicated. If the answer to _either_ of the first two
> questions is "no", then processes are indicated -- and if the answer
> to the third question is then "no", the project is in jeopardy.
Hi Tim,

That is very practical input. The point about trusting the other developer's skill not to stomp on another person's memory area is a really good one to consider in any kind of project management, and it looks like a practical implementation checkpoint.

Thanks,
Karthik
On Sun, 21 Sep 2014 05:40:26 -0700 (PDT), Karthik Balaguru
<karthik.balaguru007@gmail.com> wrote:

>Hi, >Have few queries on the best possible software architecture. > >Processes are heavy weight and they appear to occupy more memory, more time to create/start, increased latency during context switches and separate memory space that necessitates heavy IPC mechanisms. Threads are light weight and share memory space. However, I realized that threads also enter into contention for resources/memory due to the shared resources among them that inturn becomes a kind of bottle neck for multi-threaded architecture but not for multiple process based architecture. Also the workaround for having thread local storage does not seem to be straight forward. This also makes me believe that maintaining multi-threaded application can be bit complex compared to that of multiple process architecture. Also that the performance of multi process architecture will be better due to separate memory space (This avoids locking or serialization of execution in case of multi process architecture) and this seems to take away the advantage of less context switch time in case of >multi-threaded application !! Kindly let me know if this understanding is correct or correct with appropriate inputs. > >I understand that the software architecture is mainly based on the type of application/requirement. Considering the development environment as Linux OS with C language on single core/multi-core processors, i would like to know for which type of applications should we need to go in for multi-threaded software architecture and for which type of applications should we need to go in for multiple process based software architecture ? Is there any matrix sheet that maps/lists the type of requirements/applications and the possible software architecture for it ? > >Thx in advans, >Karthik
Why wonder about processes vs. threads? Use both, as I have done for decades. I prefer keeping individual programs relatively small to help with manageability, protection and updatability, with only a few threads within each address space.

For larger systems with multiple processes and address spaces, just create some shared memory areas and map these areas into multiple process address spaces. For simple items (byte/word/dword) that can be accessed atomically, you don't need any synchronization; for complex items, process A moves xx megabytes of data into a shared memory area and then, using some OS-specific mechanism, tells process B "I just uploaded xx megabytes, go ahead".

On modern virtual memory OSes (Linux/Windows), shared regions are implemented as memory mapped files (with or without backing by a real file). If the shared memory area is linked to a fixed virtual memory address, it must fit at the same virtual address in each process, and you can then use pointers within that memory area directly. If the shared memory can be loaded at any virtual address in each process, pointers into that shared area must be recalculated by each process.

For applications intended for long (more than a decade) support, one must be careful how that shared memory is structured. Put a version number and pointers to the key data structures at the absolute beginning of that shared area; that way, processes from different software versions can access the same data structures, and the composition of the shared data area can be changed at will. A sketch of such a layout follows below.
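A minimal sketch of that versioned layout (field names hypothetical; byte offsets are used instead of raw pointers so the region works at whatever address each process maps it):

    /* Minimal sketch: a versioned header at the very start of a shared
     * region, with offsets -- not pointers -- to the key structures. */
    #define _POSIX_C_SOURCE 200809L
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define SHM_MAGIC 0x544F4C4Cu    /* arbitrary sanity value */
    #define SHM_SIZE  4096

    struct shm_header {
        uint32_t magic;
        uint32_t version;            /* bump whenever the layout changes */
        uint32_t lane_table_off;     /* byte offsets from region start,  */
        uint32_t stats_off;          /*   valid at ANY mapping address   */
    };

    /* resolve an offset relative to wherever THIS process mapped it */
    static void *shm_at(void *base, uint32_t off)
    {
        return (char *)base + off;
    }

    int main(void)
    {
        int fd = shm_open("/demo_area", O_CREAT | O_RDWR, 0600);
        ftruncate(fd, SHM_SIZE);
        void *base = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);

        struct shm_header *h = base;
        if (h->magic != SHM_MAGIC) {      /* first process in: initialize
                                           * (init race ignored in sketch) */
            h->version        = 1;
            h->lane_table_off = sizeof *h;
            h->stats_off      = sizeof *h + 256;
            h->magic          = SHM_MAGIC;          /* set magic last */
        } else if (h->version != 1) {
            fprintf(stderr, "layout version %u not understood\n", h->version);
            return 1;
        }

        int *lanes = shm_at(base, h->lane_table_off); /* position-independent */
        lanes[0] = 42;
        return 0;
    }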
