
Thread based software architecture vs Process based software architecture

Started by Karthik Balaguru September 21, 2014
Hi,
I have a few queries on the best possible software architecture.

Processes are heavyweight: they appear to occupy more memory, take more time to create/start, add latency during context switches, and their separate memory spaces necessitate heavy IPC mechanisms. Threads are lightweight and share a memory space. However, I realized that threads also enter into contention for resources/memory because of the resources shared among them, which in turn becomes a kind of bottleneck for a multi-threaded architecture but not for a multiple-process architecture. Also, the workaround for having thread-local storage does not seem to be straightforward. This makes me believe that maintaining a multi-threaded application can be a bit more complex than a multiple-process architecture. It also suggests that the performance of a multi-process architecture will be better due to the separate memory spaces (which avoid locking or serialization of execution), and this seems to take away the advantage of the smaller context-switch time of a multi-threaded application!! Kindly let me know if this understanding is correct, or correct it with appropriate inputs.

I understand that the software architecture mainly depends on the type of application/requirement. Considering the development environment as Linux with C on single-core/multi-core processors, I would like to know for which types of applications we should go for a multi-threaded software architecture and for which types we should go for a multiple-process architecture. Is there any matrix sheet that maps/lists the types of requirements/applications and the possible software architecture for each?

Thanks in advance,
Karthik
On Sun, 21 Sep 2014 05:40:26 -0700, Karthik Balaguru wrote:

> Hi,
> I have a few queries on the best possible software architecture.
> [...]
You are mistaken in your notion that, just because threads explicitly share a memory space and other resources, they have more contention. Threads share a memory _space_, but they don't have to use the same bits of memory within that space -- it is easy to set things up so that each thread has its own chunk of memory that it uses.

In a sense, once you get past the MMU, processes share the same memory space, too -- it's just that the MMU protects each process from having to know about the memory space occupied by other processes, or even, for that matter, from having to know what physical addresses it occupies.

The "processes have separate memory space" is an illusion, provided in hardware by the MMU. At the point where activity is going on in physical memory, all the processes have to access the same memory space, and so they contend for that resource. Ditto hard drive accesses, screen access, etc.

Really, the biggest thing that you give up with threads vs. processes is that -- assuming the OS is doing its job -- processes are safe from one another. Threads, however, can easily stomp on one another, simply by writing into some part of memory that some other thread is using and thinks isn't going to be disturbed.

For me, the dividing line between threads and processes is one of work load, processor loading, and trust: do I trust whoever is developing that software entity over there not to stomp on my stuff, and is it less work for both of us, using threads, to not stomp on each other's stuff than it is to just use processes? And can the job be done at all using processes? If the answer to the first two questions is "yes", then threads are indicated. If the answer to _either_ of the first two questions is "no", then processes are indicated -- and if the answer to the third question is then "no", the project is in jeopardy.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
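To make the "own chunk of memory" point concrete, here is a minimal sketch (all identifiers are hypothetical, not from this thread): each pthread receives its own private context block through its start argument, so the threads share one address space yet never touch the same bytes. Contention only appears if you *choose* to hand two threads the same block.

    /* Minimal sketch: each thread gets a private context block via its
     * start argument -- shared address space, disjoint bytes. */
    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4

    struct worker_ctx {                /* one per thread, never shared */
        int  id;
        char buf[64];
    };

    static void *worker(void *arg)
    {
        struct worker_ctx *ctx = arg;  /* only this thread uses ctx */
        snprintf(ctx->buf, sizeof ctx->buf, "thread %d: private data", ctx->id);
        return NULL;
    }

    int main(void)
    {
        pthread_t tid[NTHREADS];
        static struct worker_ctx ctx[NTHREADS];   /* disjoint chunks */

        for (int i = 0; i < NTHREADS; i++) {
            ctx[i].id = i;
            pthread_create(&tid[i], NULL, worker, &ctx[i]);
        }
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(tid[i], NULL);
        for (int i = 0; i < NTHREADS; i++)
            puts(ctx[i].buf);          /* no locking was ever needed */
        return 0;
    }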
Hi Karthik,

On 9/21/2014 5:40 AM, Karthik Balaguru wrote:
> Hi, Have few queries on the best possible software architecture.
None -- without a clear definition of the application domain! :>
> Processes are heavyweight: they appear to occupy more memory, take more
> time to create/start, add latency during context switches, and their
> separate memory spaces necessitate heavy IPC mechanisms. Threads are
> lightweight and share a memory space.
The easiest way to think of the distinction is: threads are active entities (i.e., they are the "things" that "execute code"). Processes are containers that hold resources -- which can include (one or more) *threads*!

I.e., a process is like its own little "machine" -- with its own memory, access privileges, priorities (in the context of the "machine" in which it resides), etc.

So, if the "system"/machine has certain shared resources (I/O devices, etc), it is the *process* that requests (by the actions of one of its threads) those resources and, eventually, gains ownership/access to it. (I.e., thread #1 in process A can request a resource and, when made available, thread #5 in process A can *use* that resource -- but none of the threads in process B can, at that time)

Given that processes contain threads (in this conceptualization), you can see why it is "more expensive" to switch processes than it is to switch threads.

You can also see why two threads in a process can compete to access a resource THAT THE PROCESS OWNS (either because it was explicitly requested from "the system" by "some thread" in that process; OR it was implicitly granted to that process when the process was instantiated: e.g., "shared memory" IN the process's address space). You are glossing over the potential case where two or more PROCESSES have to compete in "the system" for other "shared resources".
> However, I realized that threads also enter into contention for
> resources/memory because of the resources shared among them, which in
> turn becomes a kind of bottleneck for a multi-threaded architecture but
> not for a multiple-process architecture.
It's still a bottleneck. If two or more processes want to share some data, they either do so via "shared memory" (assuming the OS supports this between processes) -- which requires SOME form of "access/contention resolution" -- or by a more expensive solution (e.g., IPC/RPC). In each case, SOMETHING is handling the fact that contention can exist.
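As a minimal sketch of what that looks like on the process side -- assuming POSIX shared memory and semaphores are available (the region name and iteration counts are arbitrary; link with -pthread, and -lrt on older glibc) -- two processes bump a shared counter, with a process-shared semaphore serving as the "access/contention resolution":

    /* Minimal sketch: two processes share a counter through POSIX shared
     * memory; a process-shared semaphore resolves the contention. */
    #define _POSIX_C_SOURCE 200809L
    #include <fcntl.h>
    #include <semaphore.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    struct shared {
        sem_t lock;
        long  counter;
    };

    int main(void)
    {
        /* the name "/demo_shm" is arbitrary for this sketch */
        int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
        ftruncate(fd, sizeof(struct shared));
        struct shared *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);
        sem_init(&s->lock, 1 /* shared between processes */, 1);
        s->counter = 0;

        if (fork() == 0) {                 /* child */
            for (int i = 0; i < 100000; i++) {
                sem_wait(&s->lock);
                s->counter++;
                sem_post(&s->lock);
            }
            _exit(0);
        }
        for (int i = 0; i < 100000; i++) { /* parent */
            sem_wait(&s->lock);
            s->counter++;
            sem_post(&s->lock);
        }
        wait(NULL);
        printf("counter = %ld\n", s->counter);   /* 200000 */
        shm_unlink("/demo_shm");
        return 0;
    }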
> Also, the workaround for having thread-local storage does not seem to
> be straightforward. This makes me believe that maintaining a
> multi-threaded application can be a bit more complex than a
> multiple-process architecture.
There is no concept of a thread's "(private) memory space" -- though you can easily arrange for this (e.g., each thread has its own pushdown stack! anything thread #1 does that is implemented via the stack is effectively private -- though a rogue thread can still scribble on it!). By contrast, each (single-threaded) process has a unique, disjoint memory space "guaranteed" by the OS at the process's instantiation (I am assuming you have a "real/nonTOY OS").
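As an aside on the earlier thread-local-storage worry: on Linux with a C11 compiler (GCC/Clang), per-thread variables are a one-keyword affair. A minimal sketch, with hypothetical names -- each thread gets its own copy of the variable:

    /* Minimal sketch: _Thread_local gives each thread a private copy. */
    #include <pthread.h>
    #include <stdio.h>

    static _Thread_local int per_thread_count;   /* one copy per thread */

    static void *worker(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 5; i++)
            per_thread_count++;          /* touches only this thread's copy */
        printf("my count: %d\n", per_thread_count);
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        pthread_create(&a, NULL, worker, NULL);
        pthread_create(&b, NULL, worker, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("main's copy: %d\n", per_thread_count);   /* still 0 */
        return 0;
    }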
> It also suggests that the performance of a multi-process architecture
> will be better due to the separate memory spaces (which avoid locking
> or serialization of execution), and this seems to take away the
> advantage of the smaller context-switch time of a multi-threaded
> application!!
No. If there is no contention, there is no locking required beyond what is implicitly present when "thread #1" is scheduled to execute while the other threads are (temporarily) blocked.

Contention has costs, period. You can structure your code so that these costs are minimized. E.g., in a consumer/producer model of sharing, the two threads never actually compete for the same "object" -- an object that is being produced is invisible to a consumer waiting to consume it! Likewise, an object that has BEEN produced is no longer of interest to its producer!

Process model gives an (incorrect) illusion of greater separation only because it "makes sharing (between PROCESSES) harder". If you similarly impose the restrictions that different process spaces impose on your code (i.e., never compete for data for which you have no NEED to compete -- as if it was NOT POSSIBLE), then the costs of sharing are the same -- none.
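That producer/consumer discipline can be shown in code. A minimal sketch assuming C11 atomics, one producer and one consumer: the producer writes only `head`, the consumer writes only `tail`, and neither ever touches a slot the other is still using, so no lock is needed.

    /* Minimal SPSC ring-buffer sketch: the two threads never compete for
     * the same slot, so synchronization reduces to two atomic indices. */
    #include <stdatomic.h>
    #include <stdbool.h>

    #define RING_SIZE 16u                  /* power of two */

    struct ring {
        int         slot[RING_SIZE];
        atomic_uint head;                  /* written only by producer */
        atomic_uint tail;                  /* written only by consumer */
    };

    bool ring_put(struct ring *r, int v)   /* producer side */
    {
        unsigned h = atomic_load_explicit(&r->head, memory_order_relaxed);
        unsigned t = atomic_load_explicit(&r->tail, memory_order_acquire);
        if (h - t == RING_SIZE)
            return false;                  /* full */
        r->slot[h % RING_SIZE] = v;        /* invisible to the consumer yet */
        atomic_store_explicit(&r->head, h + 1, memory_order_release);
        return true;
    }

    bool ring_get(struct ring *r, int *v)  /* consumer side */
    {
        unsigned t = atomic_load_explicit(&r->tail, memory_order_relaxed);
        unsigned h = atomic_load_explicit(&r->head, memory_order_acquire);
        if (t == h)
            return false;                  /* empty */
        *v = r->slot[t % RING_SIZE];       /* no longer of interest to producer */
        atomic_store_explicit(&r->tail, t + 1, memory_order_release);
        return true;
    }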
> Kindly let me know if this understanding is correct, or correct it
> with appropriate inputs.
>
> I understand that the software architecture mainly depends on the type
> of application/requirement. Considering the development environment as
> Linux with C on single-core/multi-core processors, I would like to
> know for which types of applications we should go for a multi-threaded
> software architecture and for which types we should go for a
> multiple-process architecture. Is there any matrix sheet that
> maps/lists the types of requirements/applications and the possible
> software architecture for each?
(sigh) *BIG* (complex) question. Essentially, you have to look at the benefits of "tightly coupled" execution (threads) vs. more "loosely coupled" (processes). And, the overhead involved in each sharing case. Likewise, the potential for (the illusion of) concurrency and the periods involved.

E.g., any time an "execution context" (threaded or single-thread) has to block on <something> (resource, user, i/o, etc.), then there is an opportunity for some other execution context to "do meaningful work" (note that this is not the same thing as a GUARANTEE that they will be able to do meaningful work!).

How often this occurs and the amount (percent?) of time that the blocked process is suspended -- relative to the rate at which "new work" arrives -- determines how much time you can afford to "waste" in the overhead of your model (thread v. process).

E.g., if work is represented by cars arriving at a toll booth (your job being to monitor the presence of individual cars, the receipt of appropriate payment from each and the control of the "gate" allowing paid vehicles to pass), you could (all else being equal) create a single-threaded process that:

    wait for car;
    wait for payment;
    raise gate;
    lather, rinse, repeat

And, spawn N instances of this process -- one for each "lane" at the toll booth (binding the appropriate instances of "car sensor", "coin counter", "gate actuator" to each instance). The "procedure" (I am trying to avoid using the word "process") is inherently serial -- easily handled by a single thread.

[A car doesn't arrive at lane 4, pay at gate 7 and then exit at gate 2!]

THE PROCESSES HAVE NOTHING TO SAY TO EACH OTHER! So, there is no contention *between* them.

Most of the time, a process is waiting for (the next thing) to happen. I.e., while waiting for payment, it doesn't have to deal with "another car" -- even though another car *may* be arriving in some other lane! So, the cost of multiple processes (time) is largely hidden in that "wait time".

You can, similarly, design this as a set of THREADS in the exact same way! Each thread has nothing to share with the other threads!

[Keep this in mind as you read each of the following examples. "Thread" can often be replaced by "process"; but, you will have to think of everything else going on in the particular example to evaluate how (in)effective that solution might be!]

Imagine, instead, writing this process as a set of threads: one that waits for the car; another that waits for payment; a third that raises the gate (and, presumably, ensures the car has passed successfully). These threads need to share information -- you don't want the gate_raiser thread to raise the gate before the payment_received thread has vouched for the vehicle's compliance!

That shared information can be as simple as a shared "state" variable: {AWAITING_CAR, AWAITING_PAYMENT, RAISING_GATE}. Each thread can be responsible for monitoring the variable to determine when it is appropriate to "start" AND updating the variable when it has finished its assigned chore. I.e., only one thread is ever "holding" the variable (able to write to it!). (A sketch of this follows below.)

[threads could also directly start/unblock each other in succession... lots of ways to skin this cat]

You could likewise use a set of *processes* to do this: each process (pedantically, the single thread *in* each process) responsible for blocking on a particular condition, etc. But, processes cost more and are heavier-footed than threads.
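Here is a minimal sketch of that shared-"state"-variable version for a single lane (all names hypothetical), using a pthread mutex and condition variable: each thread sleeps until the lane state says it is its turn, performs its chore, then advances the state. The chores are empty stubs in this sketch.

    /* Minimal sketch: one lane, three threads, one shared state variable.
     * Only the thread whose turn it is ever writes the state. */
    #include <pthread.h>

    enum lane_state { AWAITING_CAR, AWAITING_PAYMENT, RAISING_GATE };

    struct lane {
        enum lane_state state;
        pthread_mutex_t mtx;
        pthread_cond_t  changed;
    };

    /* wait until the lane is in `want`, run `chore`, advance to `next` */
    static void run_stage(struct lane *l, enum lane_state want,
                          void (*chore)(void), enum lane_state next)
    {
        pthread_mutex_lock(&l->mtx);
        while (l->state != want)
            pthread_cond_wait(&l->changed, &l->mtx);
        pthread_mutex_unlock(&l->mtx);

        chore();              /* watch sensor / count coins / lift gate */

        pthread_mutex_lock(&l->mtx);
        l->state = next;
        pthread_cond_broadcast(&l->changed);
        pthread_mutex_unlock(&l->mtx);
    }

    static void wait_for_car(void) { /* poll the car sensor ...   */ }
    static void take_payment(void) { /* count the coins ...       */ }
    static void lift_gate(void)    { /* drive the gate actuator ... */ }

    static void *car_thread(void *arg)
    {
        struct lane *l = arg;
        for (;;) run_stage(l, AWAITING_CAR, wait_for_car, AWAITING_PAYMENT);
        return NULL;
    }
    static void *payment_thread(void *arg)
    {
        struct lane *l = arg;
        for (;;) run_stage(l, AWAITING_PAYMENT, take_payment, RAISING_GATE);
        return NULL;
    }
    static void *gate_thread(void *arg)
    {
        struct lane *l = arg;
        for (;;) run_stage(l, RAISING_GATE, lift_gate, AWAITING_CAR);
        return NULL;
    }

    int main(void)
    {
        struct lane l = { AWAITING_CAR,
                          PTHREAD_MUTEX_INITIALIZER,
                          PTHREAD_COND_INITIALIZER };
        pthread_t a, b, c;
        pthread_create(&a, NULL, car_thread, &l);
        pthread_create(&b, NULL, payment_thread, &l);
        pthread_create(&c, NULL, gate_thread, &l);
        pthread_join(a, NULL);   /* threads loop forever in this sketch */
        return 0;
    }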
In the "process" implementation, the sharing has to happen through some OS-supported mechanism -- *if* processes are prohibited with accessing each other's (or *SYSTEM*!) resources. In the thread implementation, threads within a shared "container" can freely exchange information (relying on synchronization primitives provided by the OS *or* by constraints inherent in the algorithm: "YOU set this, I will CLEAR it") You could, also, have one giant "process" with lots of threads -- that handles the entire toll-booth. (again, lots of ways to skin this cat... I'll let you sort out the "more obvious" ones) Threads could sit "awaiting events". A set of "accepting payment" threads (responsible for verifying proper payment) can sit waiting for "CAR_ARRIVED" events (messages). When such an event is detected, the first WAITING/blocked thread consumes it and begins execution (the event obviously has to specify the lane on which the waiting car was detected!). [The next "accepting payment" thread -- IF ANY (possibly a configurable option... you might have fewer threads than lanes, etc. depends on the expected interarrival times of "cars") -- then steps up and awaits the NEXT "CAR_ARRIVED" event. This may be on the same lane as the immediately preceding event -- or, another lane entirely!] The "accepting payment" thread recently activated (above), now sits waiting for "coin received" events (from the specific lane that it is monitoring!). When it has processed enough of these to indicate proper payment, it generates a PAYMENT_RECEIVED event (tagged with the corresponding lane number) and then goes back to waiting for another "CAR_ARRIVED" event. [I.e., this flavor thread can only handle CAR_ARRIVED events!] Similarly, another (set of one or more) "raising gate" threads sit waiting for PAYMENT_RECEIVED events and act accordingly. Here, you need as many "raising gate" threads as there are gates that you want to be able to raise CONCURRENTLY! (e.g., if you don't mind letting other "paid customers" wait while you raise the gate for customer X, then you only need enough of those threads to raise *on* gate at a time! Yet another way of doing this is to have N copies of generic threads that are capable of processing *any* sort of event -- i.e., having a dispatch table (switch statement) at the start to route the event to the appropriate processing code fragment. In this case, you need only enough threads to handle the total number of "things" that can be happening at one time (i.e., one thing on each lane). Ah, but what is to prevent a cock-up in The System (or, an exploit by a savvy user?) from preventing the vehicle's initial arrival to be immediately followed by a PAYMENT_RECEIVED event? I.e., BEFORE the "accepting payment" thread has even been activated! (we have a technical term for this: "bug") In the initial "serial process", this wasn't possible: the code that was executed after payment was received COULDN'T run until a car had been detected AND coins counted. The design of the code precluded that possibility. To "exploit" the system, a user would have to synthesize all of the preceding events to "advance" the algorithm to the point where it was ready to lift the gate. OK, let's build a SHARED OBJECT that indicates the "state" of each of the lanes! That way, the "raising gate" thread won't invoke the actuator unless it sees all of the required prerequisites in place -- even if "signaled" by a PAYMENT_RECEIVED event! 
Now, you have several entities trying to update that state AT THE SAME TIME THAT OTHERS ARE TRYING TO EXAMINE IT. "Contention" that affects the entire application's performance -- ONE bottleneck (instead of a "bottleneck per lane" -- or NO bottlenecks!)

Imagine if the cost of ATOMICALLY accessing this object was a fat system call -- because it resided somewhere that all PROCESSES could access (contrast with THREADS)!

In each case, you decide how much information you are sharing and who you are sharing it with. A single thread that runs a single lane from start to finish IMPLICITLY is sharing data with itself: it saw the car arrive on its assigned lane, it watched as the coins were deposited in the coin acceptor on that lane, then it raised the gate for that lane -- before returning to await the next arrival.

As you split the "chore" into finer pieces -- or, split the handling of it into different/disjoint "execution contexts" -- you need to pass more information between those objects. E.g., passing events of the form (<lane>, <event_type>) to a set of generic "handlers" moves the sharing into the "event system".

[whether this is a fifo, shared memory, IPC, etc.]

OTOH, you increase the possibilities for concurrency and more efficient use of resources (why have N "raise gate" processes if drivers can afford to wait for THEIR gate to be lifted? Perhaps the gate lift mechanism can ONLY lift a single gate at a time -- motor and gears/clutches!).

Sorry for the long-winded explanation. I will promptly be derided for it. But, hopefully it shows you different approaches (that exploit "potential parallelism/decomposition" in different ways) and the potential consequences of different approaches.

You have to look at your workload and see what approach makes the most sense. Interconnections are expensive in any algorithm!
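For completeness, a minimal sketch of the "generic worker" variant mentioned above (the event queue and handler names are hypothetical): N identical threads pull (<lane>, <event_type>) events from one queue and dispatch on the event type.

    /* Minimal sketch of the "generic worker" variant: identical threads
     * pull events off one queue and dispatch via a switch statement. */
    #include <pthread.h>

    enum ev_type { CAR_ARRIVED, COIN_RECEIVED, PAYMENT_RECEIVED };
    struct event { int lane; enum ev_type type; };

    #define QCAP 64                       /* power of two; sketch only */
    static struct event    evq[QCAP];
    static unsigned        q_head, q_tail;
    static pthread_mutex_t q_mtx  = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  q_cond = PTHREAD_COND_INITIALIZER;

    static void queue_put(struct event ev)
    {
        pthread_mutex_lock(&q_mtx);
        evq[q_head++ % QCAP] = ev;        /* sketch: assumes never full */
        pthread_cond_signal(&q_cond);
        pthread_mutex_unlock(&q_mtx);
    }

    static struct event queue_get(void)   /* blocks until work arrives */
    {
        pthread_mutex_lock(&q_mtx);
        while (q_tail == q_head)
            pthread_cond_wait(&q_cond, &q_mtx);
        struct event ev = evq[q_tail++ % QCAP];
        pthread_mutex_unlock(&q_mtx);
        return ev;
    }

    static void handle_car(int lane)     { (void)lane; /* start coin count */ }
    static void handle_coin(int lane)    { (void)lane; /* tally payment   */ }
    static void handle_payment(int lane) { (void)lane; /* raise that gate */ }

    static void *generic_worker(void *arg)
    {
        (void)arg;
        for (;;) {
            struct event ev = queue_get();
            switch (ev.type) {            /* the "dispatch table" */
            case CAR_ARRIVED:      handle_car(ev.lane);     break;
            case COIN_RECEIVED:    handle_coin(ev.lane);    break;
            case PAYMENT_RECEIVED: handle_payment(ev.lane); break;
            }
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t w[3];                   /* one per concurrent "thing" */
        for (int i = 0; i < 3; i++)
            pthread_create(&w[i], NULL, generic_worker, NULL);
        queue_put((struct event){ .lane = 4, .type = CAR_ARRIVED });
        pthread_join(w[0], NULL);         /* workers loop forever here */
        return 0;
    }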
On 9/21/2014 12:23 PM, Tim Wescott wrote:
> On Sun, 21 Sep 2014 05:40:26 -0700, Karthik Balaguru wrote:
>
> You are mistaken in your notion that, just because threads explicitly
> share a memory space and other resources, they have more contention.
> Threads share a memory _space_, but they don't have to use the same
> bits of memory within that space -- it is easy to set things up so
> that each thread has its own chunk of memory that it uses.
>
> In a sense, once you get past the MMU, processes share the same memory
> space, too -- it's just that the MMU protects each process from having
> to know about the memory space occupied by other processes, or even,
> for that matter, from having to know what physical addresses it
> occupies.
MMU is not a requirement of a process model. Nor is it excluded from a thread model (even in a single-process system). There is no guarantee that processes are protected from each other. That is an implementation detail. I.e., you can adopt a "process model" and have everyone living in a single, unified, FLAT memory space.

The better way of thinking of processes is as resource containers (threads being one sort of resource). As such, they are bigger/heavier than threads -- which only have to remember their current processor state.

E.g., a *process* can hold a resource. A thread cannot (the process CONTAINING the thread holds it). So, a process handling your "console" can hold that console (hardware/software construct) and one thread in it can paint the screen while another thread is responsible for "ringing the bell" (which takes a sizable fraction of a second!)
> The "processes have separate memory space" is an illusion, provided in > hardware by the MMU. At the point where activity is going on in physical > memory, all the processes have to access the same memory space, and so > they contend for that resource. Ditto hard drive accesses, screen access, > etc.
In most modern OS's (I won't use the L-word!), an MMU enforces a separation (partitioning) of this "hardware" address space. *If* the processor has such hardware available. (how "violations" are handled is another subject)
> Really, the biggest thing that you give up with threads vs. processes
> is that -- assuming the OS is doing its job -- processes are safe from
> one another. Threads, however, can easily stomp on one another, simply
> by writing into some part of memory that some other thread is using
> and thinks isn't going to be disturbed.
No, that isn't guaranteed -- unless you speak of a specific *port* of a specific OS (think of OS's that claim to run on hardware WITHOUT MMU's! The process model still applies -- you just lose the protections!)
> For me, the dividing line between threads and processes is one of work
> load, processor loading, and trust: do I trust whoever is developing
> that software entity over there not to stomp on my stuff, and is it
> less work for both of us, using threads, to not stomp on each other's
> stuff than it is to just use processes? And can the job be done at all
> using processes? If the answer to the first two questions is "yes",
> then threads are indicated. If the answer to _either_ of the first two
> questions is "no", then processes are indicated -- and if the answer
> to the third question is then "no", the project is in jeopardy.
For me, the criterion (if it has to be boiled down to a single one) is "communications". The more data that has to be "interactively" shared (or, the higher the frequency of sharing), the more of an annoyance process boundaries become. Because there are BOUNDARIES that must be crossed (at some cost) -- even if only in a conceptual sense (i.e., just because you don't have hardware protections doesn't mean scribbling in another process -- CONTAINER -- is "right").
On Sun, 21 Sep 2014 05:40:26 -0700 (PDT), Karthik Balaguru
<karthik.balaguru007@gmail.com> wrote:

>Processes are heavy weight and they appear to occupy more memory, >more time to create/start, increased latency during context >switches and separate memory space that necessitates heavy IPC >mechanisms. Threads are light weight and share memory space.
Yes and no. The best way to think of a process is as a resource container - a thread is a particular kind of resource (a computation resource) that a process can contain. Processes also typically are protection boundaries whereas threads typically are not [though there are exceptions].
>Considering the development environment as Linux OS with C >language on single core/multi-core processors, i would like to >know for which type of applications should we need to go in >for multi-threaded software architecture and for which type >of applications should we need to go in for multiple process >based software architecture ?
In Linux there is a system call "clone" (see clone(2)). Clone essentially creates new threads, but it permits detaching a new thread into a separate process and specifying with relatively fine control what parent resources should be copied to the child. Clone wraps an even lower level call that provides even more control over the environment of the new thread. Using clone you can create very lightweight processes, e.g., just a thread with MMU protection.
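A minimal sketch of clone(2) in use (Linux-specific, via the glibc wrapper; the flag set is chosen purely for illustration): with CLONE_VM the child shares the parent's address space, thread-style; drop CLONE_VM and the child gets its own copy of memory, process-style.

    /* Minimal clone(2) sketch: a thread-like child that shares memory
     * with its parent.  Removing CLONE_VM makes it process-like. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>

    static int shared_counter;

    static int child_fn(void *arg)
    {
        (void)arg;
        shared_counter++;        /* visible to parent only with CLONE_VM */
        return 0;
    }

    int main(void)
    {
        const size_t STACK_SIZE = 64 * 1024;
        char *stack = malloc(STACK_SIZE);   /* child's stack; grows down */

        /* thread-like child: shares memory, filesystem info, files */
        pid_t pid = clone(child_fn, stack + STACK_SIZE,
                          CLONE_VM | CLONE_FS | CLONE_FILES | SIGCHLD, NULL);
        if (pid == -1) { perror("clone"); return 1; }
        waitpid(pid, NULL, 0);

        printf("counter = %d\n", shared_counter);   /* 1 with CLONE_VM */
        free(stack);
        return 0;
    }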
>Karthik
George
Hi Don,

Thanks for your quick reply!
That was indeed a pretty long & interesting explanation!!

Karthik



On Monday, 22 September 2014 02:34:45 UTC+5:30, Don Y  wrote:
> Hi Karthik,
>
> [...]
>
> You have to look at your workload and see what approach makes the
> most sense. Interconnections are expensive in any algorithm!
Hi George,

Thanks for pointing it out. I agree that clone can be really handy in system architectures based on multiple threads running concurrently in a shared memory space, since its flags control the different levels of sharing between the parent and child tasks.

Karthik

On Tuesday, 23 September 2014 00:45:58 UTC+5:30, George Neuner  wrote:
> [...]
>
> In Linux there is a system call "clone" (see clone(2)). Clone
> essentially creates new threads, but it permits detaching a new
> thread into a separate process and specifying with relatively fine
> control what parent resources should be copied to the child.
>
> Clone wraps an even lower level call that provides even more control
> over the environment of the new thread. Using clone you can create
> very lightweight processes, e.g., just a thread with MMU protection.
Hi Karthik,

On 9/26/2014 8:11 AM, Karthik Balaguru wrote:

> Thanks for your quick reply!
> That was indeed a pretty long & interesting explanation!!
The point is to show how a single application can be approached in a variety of different ways. And, within those different approaches, how the "sharing"/contention can manifest -- or not. Finally, the relative differences in costs between process vs. threaded implementations when it comes to that sharing.

Figuring out how to approach YOUR problem (how to decompose it) will be your first step to determining the "most effective" implementation.
On Monday, 22 September 2014 00:53:32 UTC+5:30, Tim Wescott  wrote:
> [...]
>
> For me, the dividing line between threads and processes is one of work
> load, processor loading, and trust: do I trust whoever is developing
> that software entity over there not to stomp on my stuff, and is it
> less work for both of us, using threads, to not stomp on each other's
> stuff than it is to just use processes? And can the job be done at all
> using processes? If the answer to the first two questions is "yes",
> then threads are indicated. If the answer to _either_ of the first two
> questions is "no", then processes are indicated -- and if the answer
> to the third question is then "no", the project is in jeopardy.
Hi Tim,

That is very practical input. The point about trusting the other developer's skill not to stomp on another person's memory area is a really good one to consider in any kind of project management, and it looks like a practical implementation checkpoint.

Thanks,
Karthik
On Sun, 21 Sep 2014 05:40:26 -0700 (PDT), Karthik Balaguru
<karthik.balaguru007@gmail.com> wrote:

>Hi, >Have few queries on the best possible software architecture. > >Processes are heavy weight and they appear to occupy more memory, more time to create/start, increased latency during context switches and separate memory space that necessitates heavy IPC mechanisms. Threads are light weight and share memory space. However, I realized that threads also enter into contention for resources/memory due to the shared resources among them that inturn becomes a kind of bottle neck for multi-threaded architecture but not for multiple process based architecture. Also the workaround for having thread local storage does not seem to be straight forward. This also makes me believe that maintaining multi-threaded application can be bit complex compared to that of multiple process architecture. Also that the performance of multi process architecture will be better due to separate memory space (This avoids locking or serialization of execution in case of multi process architecture) and this seems to take away the advantage of less context switch time in case of >multi-threaded application !! Kindly let me know if this understanding is correct or correct with appropriate inputs. > >I understand that the software architecture is mainly based on the type of application/requirement. Considering the development environment as Linux OS with C language on single core/multi-core processors, i would like to know for which type of applications should we need to go in for multi-threaded software architecture and for which type of applications should we need to go in for multiple process based software architecture ? Is there any matrix sheet that maps/lists the type of requirements/applications and the possible software architecture for it ? > >Thx in advans, >Karthik
Why wonder about processes vs. threads? Use both, as I have done for decades. I prefer keeping individual programs relatively small to help with manageability, protection and updatability, with only a few threads within each address space.

For larger systems with multiple processes and address spaces, just create some shared memory areas and map these areas into multiple process address spaces. For simple items (byte/word/dword) that can be accessed atomically, you don't need any synchronization; for complex items, process A moves xx megabytes of data into a shared memory area and then, using some OS-specific mechanism, tells process B "I just uploaded xx megabytes, go ahead".

On modern virtual memory OSes (Linux/Windows), shared regions are implemented as memory mapped files (with or without backing by a real file). If the shared memory area is linked to a fixed virtual memory address, it must fit at the same virtual address in each process, and you can then use pointers within that memory area directly. If the shared memory can be loaded at any virtual address in each process, pointers into that shared area must be recalculated by each process.

For applications intended for long (more than a decade) support, one must be careful how that shared memory is structured. Put a version number and pointers to the key data structures at the absolute beginning of that shared area; that way, processes from different software versions can access the same data structures, and the composition of the shared data area can be changed at will. A sketch of such a layout follows below.
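A minimal sketch of that versioned layout (field names hypothetical; byte offsets are used instead of raw pointers so the region works at whatever address each process maps it):

    /* Minimal sketch: a versioned header at the very start of a shared
     * region, with offsets -- not pointers -- to the key structures. */
    #define _POSIX_C_SOURCE 200809L
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define SHM_MAGIC 0x544F4C4Cu    /* arbitrary sanity value */
    #define SHM_SIZE  4096

    struct shm_header {
        uint32_t magic;
        uint32_t version;            /* bump whenever the layout changes */
        uint32_t lane_table_off;     /* byte offsets from region start,  */
        uint32_t stats_off;          /*   valid at ANY mapping address   */
    };

    /* resolve an offset relative to wherever THIS process mapped it */
    static void *shm_at(void *base, uint32_t off)
    {
        return (char *)base + off;
    }

    int main(void)
    {
        int fd = shm_open("/demo_area", O_CREAT | O_RDWR, 0600);
        ftruncate(fd, SHM_SIZE);
        void *base = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);

        struct shm_header *h = base;
        if (h->magic != SHM_MAGIC) {      /* first process in: initialize
                                           * (init race ignored in sketch) */
            h->version        = 1;
            h->lane_table_off = sizeof *h;
            h->stats_off      = sizeof *h + 256;
            h->magic          = SHM_MAGIC;          /* set magic last */
        } else if (h->version != 1) {
            fprintf(stderr, "layout version %u not understood\n", h->version);
            return 1;
        }

        int *lanes = shm_at(base, h->lane_table_off); /* position-independent */
        lanes[0] = 42;
        return 0;
    }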
