
When exactly do you choose to use a RTOS (instead of a non-OS approach)?

Started by pozz December 9, 2017
Tom Gardner wrote:
> On 06/01/18 03:29, Les Cargill wrote:
>> Tom Gardner wrote:
>>> On 04/01/18 19:31, rickman wrote:
>>>> rickman wrote on 1/4/2018 3:33 AM:
>>>>> Les Cargill wrote on 1/3/2018 6:59 PM:
>>>>>> rickman wrote:
>>>>>>> Les Cargill wrote on 1/3/2018 7:07 AM:
>>>>>>>> rickman wrote:
>>>>>>>>> Ed Prochak wrote on 12/19/2017 11:19 AM:
>>>>>>>>>> On Saturday, December 16, 2017 at 12:56:31 PM UTC-5, Les Cargill wrote:
>>>>>>>>>>> Mike Perkins wrote:
>>>>>>>>>>>> On 09/12/2017 16:05, pozz wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I use it where I have a number of 'tasks' which then
>>>>>>>>>>>> interact with each other.
>>>>>>>>>>>>
>>>>>>>>>>>> If your system is a pure state machine then there is no need
>>>>>>>>>>>> for an RTOS.
>>>>>>>>>>>
>>>>>>>>>>> Indeed. Indeed.
>>>>>>>>>>
>>>>>>>>>> If it is just one state machine. Then yes, indeed.
>>>>>>>>>> ed
>>>>>>>>>
>>>>>>>>> It can always be one state machine. The only issue is how
>>>>>>>>> complex the combined state machine is.
>>>>>>>>
>>>>>>>> It's always a case of "how long should a man's legs be?" Lincoln
>>>>>>>> is alleged to have said "long enough to touch the ground."
>>>>>>>>
>>>>>>>>> Actually the issue is not how many state machines you have. It
>>>>>>>>> is the timing requirements of the various state machines. If your
>>>>>>>>> timing is lax, sequential operation of multiple machines is easy.
>>>>>>>>> But this often becomes a huge discussion with everyone talking
>>>>>>>>> past each other.
>>>>>>>>
>>>>>>>> They should use state machines to keep track then :)
>>>>>>>>
>>>>>>>> If your system can be decomposed as roughly events cross
>>>>>>>> state, you have a snowball's chance of "understanding" it.
>>>>>>>
>>>>>>> Sorry, I don't know what this means "roughly events cross state".
>>>>>>
>>>>>> You are in state A. Event 42 occurs. Lookup.... ah, here -
>>>>>> if we get event 42 whilst in state A, we move to state B
>>>>>> and send message m.
>>>>>
>>>>> So what makes that so hard to understand?
>>>>
>>>> To be more explicit, I think every FSM I've ever coded was along the
>>>> lines of a case statement on the state with IF conditions on all the
>>>> interesting inputs. I find this structure to be self documenting if
>>>> the signal names are chosen well.
>>>>
>>>> Why is this structure hard to understand?
>>>
>>> I've seen a commercial case where that type of structure was
>>> mutated by people under pressure to:
>>>   - have a single state machine for all different customers
>>>   - make the minimum change
>>>   - do it fast
>>>   - where the original designers had left
>>>   - a custom domain specific language, ugh
>>>
>>> The result was an unholy unmaintainable mess, with
>>> if-then-elses nested up to 10 (ten!) deep.
>>
>> Eh, doing it totally, totally wrong. With all due respect - you really
>> need a configuration system managed well within the code itself, and
>> customer specializations are but different settings.
>
> I didn't have due respect; I had due disrespect.
>
> It was all contained in a configuration system, that
> bloated wrong-headed pile of ordure IBM Rational
> Clearcase.
We've all been there.

I worked one place where the software chief published a config spec daily ( or even more often ).

This being said, CM people have a natural affinity for ClearCase.
> I once, and only once, looked at the version trees.
> For a couple of years it was respectable and sane:
> a trunk with a few branches. Then it exploded over
> the screen and looked like a plant with a serious
> enzyme disorder: there were even backwards branching
> loops. I have no idea how they achieved that nor why
> (other that get it out the door tomorrow).
>
>> On one product, the "model number' was nothing more than a name
>> that pointed to a hard table of defaults for the configuration,
>> and further config changes were possible. If the support
>> people requested it, we'd grow a model number when they got too painful.
>>
>> That being said - OO and configuration go together like chalk and cheese
>> :)
>>
>>> Completely insane, of course.
>>
>> Well, yeah. You can make anything disgusting
>> if you work hard enough :)
>
> And they did work hard enough; enhancements were
> positively sclerotic.
>
>>> My preference is for a dispatch table based on event+state,
>>> since that forces you to consider all possibilities.
>>
>> Yup yup. And you can generate a test vector for it with combinators.
>> Then testing essentially becomes compressing the output - tedious, but
>> ... rewarding.
>>
>> If the tested FSM is properly tested, then all remaining
>> defects are 1) either tweak/tone of unrealistic expectations or
>> 2) hardware bugs.
>>
>>> The dispatch table can be either a 2D array of function pointers,
>>> or inherent in an OOP inheritance hierarchy where the
>>> hierarchy is a direct mirror of the state hierarchy.
>>
>> There's something to be said for the 2d approach. It's
>> more canonical ( and more normal ) for one.
>
> Agreed, but it can be convenient to express the design
> in other terms. For example:
>  - the top level catches events that have not been dealt
>    with elsewhere; I prefer this to be a "should not
>    happen" which is logged
>  - next level down is divided into a small number of
>    superstates, maybe "initialising", "working normally",
>    "gross fault recovery"
>  - bottom level is the normal states, e.g. "door open",
>    "door closed", "fire alarm", etc
> Events are "delivered" to the bottom level and, if not
> handled, percolate up to the next level. Levels naturally
> correspond to an OOP class hierarchy.
To an extent, that's true. It seemed to make the systems engineers happier than the software guys.
> That's nothing new, see Harel's StateCharts. Having said
> that, I'm not overly fond of some aspects of StateCharts.
Right. But they're adaptable.
-- Les Cargill
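For readers who want to see the "2D array of function pointers" idea from the quoted text in concrete form, here is a minimal sketch in Python. The state and event names are borrowed from the door/fire-alarm examples above; the handler functions, the `unexpected` list and the particular transitions are assumptions made up for illustration, not anyone's actual design.

```python
from enum import IntEnum


class State(IntEnum):
    DOOR_CLOSED = 0
    DOOR_OPEN = 1
    FIRE_ALARM = 2


class Event(IntEnum):
    OPEN = 0
    CLOSE = 1
    SMOKE = 2
    RESET = 3


unexpected = []  # "should not happen" pairs are recorded here (hypothetical log)


def stay(state, event):
    """Legal event that causes no transition."""
    return state


def log_unexpected(state, event):
    """Cell for combinations that should not occur; record and stay put."""
    unexpected.append((state, event))
    return state


def to_open(state, event):
    return State.DOOR_OPEN


def to_closed(state, event):
    return State.DOOR_CLOSED


def to_alarm(state, event):
    return State.FIRE_ALARM


# Rows are states, columns are events; every cell has to be filled in,
# which is what forces each combination to be considered.
TABLE = [
    # OPEN           CLOSE           SMOKE     RESET
    [to_open,        log_unexpected, to_alarm, stay],       # DOOR_CLOSED
    [log_unexpected, to_closed,      to_alarm, stay],       # DOOR_OPEN
    [stay,           stay,           stay,     to_closed],  # FIRE_ALARM
]

# Fail early if a state or event is added without extending the table.
assert len(TABLE) == len(State)
assert all(len(row) == len(Event) for row in TABLE)


def dispatch(state, event):
    """Look up the handler for (state, event) and return the next state."""
    return TABLE[state][event](state, event)
```

A call such as `dispatch(State.DOOR_OPEN, Event.SMOKE)` returns the next state. The shape checks after the table are one cheap way to get the "forces you to consider all possibilities" property in a language without fixed-size arrays.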
Tom Gardner wrote:
> On 06/01/18 03:29, Les Cargill wrote:
>> Tom Gardner wrote:
>>> My preference is for a dispatch table based on event+state,
>>> since that forces you to consider all possibilities.
>>
>> Yup yup. And you can generate a test vector for it with combinators.
>> Then testing essentially becomes compressing the output - tedious, but
>> ... rewarding.
>>
>> If the tested FSM is properly tested, then all remaining
>> defects are 1) either tweak/tone of unrealistic expectations or
>> 2) hardware bugs.
>
> Unfortunately you can't test FSMs "properly", for my
> definition of "properly" :) That's based on the
> unfashionable and inconvenient concept that "you
> can't test quality into a product".
Wait; you *can* do proofs on FSMs. "Quality" is sufficiently nebulous and subjective to be a not-a-thing.
> For the systems I've designed, there's a lot to
> be gained by keeping a compressed log of "state
> trajectory" around until not needed.
Yep yep....
> That plus
> accurate timestamping of events to/from external
> systems has enabled me to quickly and unambiguously
> deflect blame away from my stuff and onto other
> companies. The lawyers were never even called :)
Exactly. Precisely even that.
-- Les Cargill
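As a rough illustration of the "compressed log of state trajectory" with timestamps mentioned in the quoted text, here is one possible sketch in Python. The class name `TrajectoryLog`, the run-length style of compression and the bounded capacity are all assumptions; the posts do not say how the original logs were built.

```python
import time
from collections import deque


class TrajectoryLog:
    """Bounded, timestamped record of FSM transitions (illustrative only).

    'Compression' here just means collapsing consecutive identical
    transitions into one entry with a repeat count.
    """

    def __init__(self, capacity=256, clock=time.monotonic):
        self._entries = deque(maxlen=capacity)  # oldest entries fall off
        self._clock = clock

    def record(self, state, event, new_state):
        now = self._clock()
        if self._entries:
            last = self._entries[-1]
            if (last["state"], last["event"], last["new_state"]) == (
                state, event, new_state
            ):
                # Same transition as the previous entry: bump the count.
                last["count"] += 1
                last["last_seen"] = now
                return
        self._entries.append({
            "state": state,
            "event": event,
            "new_state": new_state,
            "first_seen": now,
            "last_seen": now,
            "count": 1,
        })

    def dump(self):
        """Return the retained entries, oldest first."""
        return list(self._entries)
```

Passing the clock in as a parameter keeps the log testable and lets a real system substitute whatever timestamp source it trusts for events to and from external systems.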
Paul Rubin wrote:
> Les Cargill <lcargill99@comcast.com> writes:
>> It depends on your appetite for indeterminacy; I say if
>> you're leaning on the timer tick to operate, you have multiple
>> latent bugs.
>
> A complex enough program will have latent bugs no matter what.
Not... necessarily. If the complexity means it's never truly finished, then yes.
> If the
> application critically depends on having no bugs, that dependence is a
> bug in its own right.
Well, I think we should all strive for that regardless. IMO, it doesn't cost more to do things in a rigorous and careful way. I'd point you to the writings of Bruce Powel Douglass. That, of course, is 180 degrees opposite of what many think of as modern software practice.
> A reliable system has to be able to recover from
> faults including software bugs. Erlang/OTP is set up so if something
> goes wrong in an operation, the process handling it crashes and a
> supervision process restarts it so things are in a known state again.
Yep.
> This is enough to recover from quite a lot of unexpected problems. You
> look at the crash log the next day, figure out what happened, and roll
> out a fix. You can even upgrade the software while it's still running.
It's a heck of a nice system.
-- Les Cargill
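The crash-and-restart idea described in the quoted text can be imitated in miniature outside Erlang. The following Python sketch is not Erlang/OTP and does not use its API; `Counter`, `supervise` and `max_restarts` are hypothetical names chosen only to show the shape of "let it crash, log it, restart from a known state".

```python
class Counter:
    """Hypothetical worker whose internal state a bug can corrupt."""

    def __init__(self):
        self.total = 0

    def handle(self, value):
        if value < 0:
            self.total = None  # simulate corrupted state
            raise ValueError("negative input")
        self.total += value
        return self.total


def supervise(make_worker, jobs, max_restarts=3):
    """Feed jobs to a worker; on a crash, log it and start a fresh worker."""
    worker = make_worker()
    results = []
    crash_log = []
    for job in jobs:
        try:
            results.append(worker.handle(job))
        except Exception as exc:
            crash_log.append((job, repr(exc)))
            if len(crash_log) > max_restarts:
                raise  # too many crashes: give up and escalate
            worker = make_worker()  # back to a known-good state
    return results, crash_log
```

The supervisor never tries to repair the broken worker; it throws it away, which is the point of the approach. The crash log is what gets read "the next day".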
upsidedown@downunder.com wrote:
> On Sat, 6 Jan 2018 09:45:31 +0000, Tom Gardner
> <spamjunk@blueyonder.co.uk> wrote:
>
>> On 06/01/18 00:35, Paul Rubin wrote:
>>> Tom Gardner <spamjunk@blueyonder.co.uk> writes:
>>>> Processors with up to 32 cores and 4000MIPS, and interrupt latencies
>>>> of 10ns.
>>>
>>> Why stop there? http://greenarraychipss.com
>>
>> 1) I believe you can simply add more chips
>> in parallel, and the comms simply works, albeit
>> with increased latency
>>
>> 2) hardware is easy; the software is more difficult
>> and more important. XMOS has very good software
>> support (based on 40 years of theory and practical
>> implementations), plus excellent integration with
>> the hardware.
>
> The xCore style architecture is nice for multichannel DSP
> applications, in which each channel is assigned a dedicated core and
> the single sampling clock is routed to all cores, starting the
> execution of all cores at each sample clock transition.
>
> The xCore could also be used to implement PLCs (Programmable Logic
> Controller) with each execution loop executed in a dedicated core,
> usually with different clocks for each loop. Quite a lot of problems
> can be solved in a PLC type environment and the IEC 61131 programming
> environment is quite handy. IEC 61131 has multiple kinds of
> programming languages, e.g. ladder logic or ST (Structured Text, a
> Modula/Pascal style programming language).
IMO, the systems software end of PLCs is a slow-moving train wreck. They do what they do very well but nobody who uses them seems very fond of what they are capable of. Ladder logic is excellent, but rather limited. And part of that is that SCADA protocols seem to be just bizarre and... rent-seekey. I know ( and have done some work for ) someone who is using RasPi and Arduino-inspired custom hardware and it's a whale of a lot richer software suite than traditional SCADA/PLC.
> However, I do _not_ think that the xCore would be very handy for ad
> hoc parallel processing, even with XMOS programming environment (which
> is quite versatile).
It seems like total overkill for SCADA in general. I don't know if it means you can have GPP solutions that compete with ASICs ( I seriously doubt it ) but the economics of ASICs have gone mad anyway :)
>> 3) look at the investor lists for each company
>>
>> IMNSHO, point 2 is the killer advantage/USP.
-- Les Cargill
Paul Rubin wrote on 1/4/2018 7:46 PM:
> rickman <gnuarm.deletethisbit@gmail.com> writes:
>> Why is this structure hard to understand?
>
> It's just that the decomposition into state transitions obscures what
> the program is supposed to do at a higher level. As an extreme example,
> consider compiling a regular expression into a DFSM and look how opaque
> the DFSM is by comparison to the original regex.
I guess we design very different devices. I can't think of a time when I didn't start analysis of my problem with an FSM diagram.
-- Rick C
Viewed the eclipse at Wintercrest Farms, on the centerline of totality since 1998
Tom Gardner wrote on 1/5/2018 4:12 AM:
> On 04/01/18 19:31, rickman wrote:
>> rickman wrote on 1/4/2018 3:33 AM:
>>> Les Cargill wrote on 1/3/2018 6:59 PM:
>>>> rickman wrote:
>>>>> Les Cargill wrote on 1/3/2018 7:07 AM:
>>>>>> rickman wrote:
>>>>>>> Ed Prochak wrote on 12/19/2017 11:19 AM:
>>>>>>>> On Saturday, December 16, 2017 at 12:56:31 PM UTC-5, Les Cargill wrote:
>>>>>>>>> Mike Perkins wrote:
>>>>>>>>>> On 09/12/2017 16:05, pozz wrote:
>>>>>>>>>>
>>>>>>>>>> I use it where I have a number of 'tasks' which then interact with
>>>>>>>>>> each other.
>>>>>>>>>>
>>>>>>>>>> If your system is a pure state machine then there is no need for an
>>>>>>>>>> RTOS.
>>>>>>>>>
>>>>>>>>> Indeed. Indeed.
>>>>>>>>
>>>>>>>> If it is just one state machine. Then yes, indeed.
>>>>>>>> ed
>>>>>>>
>>>>>>> It can always be one state machine. The only issue is how complex the
>>>>>>> combined state machine is.
>>>>>>
>>>>>> It's always a case of "how long should a man's legs be?" Lincoln
>>>>>> is alleged to have said "long enough to touch the ground."
>>>>>>
>>>>>>> Actually the issue is not how many state machines you have. It is the
>>>>>>> timing requirements of the various state machines. If your timing is
>>>>>>> lax, sequential operation of multiple machines is easy. But this often
>>>>>>> becomes a huge discussion with everyone talking past each other.
>>>>>>
>>>>>> They should use state machines to keep track then :)
>>>>>>
>>>>>> If your system can be decomposed as roughly events cross
>>>>>> state, you have a snowball's chance of "understanding" it.
>>>>>
>>>>> Sorry, I don't know what this means "roughly events cross state".
>>>>
>>>> You are in state A. Event 42 occurs. Lookup.... ah, here -
>>>> if we get event 42 whilst in state A, we move to state B
>>>> and send message m.
>>>
>>> So what makes that so hard to understand?
>>
>> To be more explicit, I think every FSM I've ever coded was along the lines
>> of a case statement on the state with IF conditions on all the interesting
>> inputs. I find this structure to be self documenting if the signal names
>> are chosen well.
>>
>> Why is this structure hard to understand?
>
> I've seen a commercial case where that type of structure was
> mutated by people under pressure to:
> - have a single state machine for all different customers
> - make the minimum change
> - do it fast
> - where the original designers had left
> - a custom domain specific language, ugh
>
> The result was an unholy unmaintainable mess, with
> if-then-elses nested up to 10 (ten!) deep.
>
> Completely insane, of course.
>
> My preference is for a dispatch table based on event+state,
> since that forces you to consider all possibilities. The
> dispatch table can be either a 2D array of function pointers,
> or inherent in an OOP inheritance hierarchy where the
> hierarchy is a direct mirror of the state hierarchy.
Bad code can be written by any method.
-- Rick C
Viewed the eclipse at Wintercrest Farms, on the centerline of totality since 1998
Tom Gardner wrote on 1/6/2018 4:17 AM:
> On 06/01/18 03:29, Les Cargill wrote:
>> Tom Gardner wrote:
>>> My preference is for a dispatch table based on event+state,
>>> since that forces you to consider all possibilities.
>>
>> Yup yup. And you can generate a test vector for it with combinators.
>> Then testing essentially becomes compressing the output - tedious, but
>> ... rewarding.
>>
>> If the tested FSM is properly tested, then all remaining
>> defects are 1) either tweak/tone of unrealistic expectations or
>> 2) hardware bugs.
>
> Unfortunately you can't test FSMs "properly", for my
> definition of "properly" :) That's based on the
> unfashionable and inconvenient concept that "you
> can't test quality into a product".
An FSM is one of the cases where this isn't true. You can set up a unit test that exhaustively covers every combination of state and input and verifies that the machine matches the specification.
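A small sketch of what such a test might look like, in Python. The implementation under test is written in the "case statement on the state with IF conditions" style described earlier in the thread; the states, events and the `SPEC` table are hypothetical and exist only to make the example runnable.

```python
from itertools import product

STATES = ("closed", "open", "alarm")
EVENTS = ("open", "close", "smoke", "reset")

# Specification: the required next state for every (state, event) pair.
SPEC = {
    ("closed", "open"): "open",
    ("closed", "close"): "closed",
    ("closed", "smoke"): "alarm",
    ("closed", "reset"): "closed",
    ("open", "open"): "open",
    ("open", "close"): "closed",
    ("open", "smoke"): "alarm",
    ("open", "reset"): "open",
    ("alarm", "open"): "alarm",
    ("alarm", "close"): "alarm",
    ("alarm", "smoke"): "alarm",
    ("alarm", "reset"): "closed",
}


def step(state, event):
    """Implementation under test: case on the state, IFs on the inputs."""
    if state == "closed":
        if event == "open":
            return "open"
        if event == "smoke":
            return "alarm"
    elif state == "open":
        if event == "close":
            return "closed"
        if event == "smoke":
            return "alarm"
    elif state == "alarm":
        if event == "reset":
            return "closed"
    return state  # anything else leaves the state unchanged


def check_all_transitions():
    """Return every (state, event, got, expected) that disagrees with SPEC."""
    mismatches = []
    for state, event in product(STATES, EVENTS):
        got = step(state, event)
        expected = SPEC[(state, event)]
        if got != expected:
            mismatches.append((state, event, got, expected))
    return mismatches
```

This covers every single-step transition, which is a finite set. It says nothing by itself about properties of whole event sequences.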
> For the systems I've designed, there's a lot to
> be gained by keeping a compressed log of "state
> trajectory" around until not needed. That plus
> accurate timestamping of events to/from external
> systems has enabled me to quickly and unambiguously
> deflect blame away from my stuff and onto other
> companies. The lawyers were never even called :)
-- Rick C
Viewed the eclipse at Wintercrest Farms, on the centerline of totality since 1998
On 05/01/18 00:46, Paul Rubin wrote:
> rickman <gnuarm.deletethisbit@gmail.com> writes:
>> Why is this structure hard to understand?
>
> It's just that the decomposition into state transitions obscures what
> the program is supposed to do at a higher level. As an extreme example,
> consider compiling a regular expression into a DFSM and look how opaque
> the DFSM is by comparison to the original regex.
As I'm sure you know, that's merely one example of the use of FSMs, and one where the regexp is the more comprehensible expression of the intent. In such cases, use regexps to express the operation. But for many embedded systems, FSMs are (or IMNSHO ought to be!) a key component in structuring and expressing the design. Canonical examples are ATMs, car cruise control, comms/networking protocols etc.
rickman <gnuarm.deletethisbit@gmail.com> writes:
> FSM is one of the things where this isn't true. You can set up a unit
> test that exhaustively tests all possible combinations of inputs and
> states to verify the machine matches the specification.
You can't exhaustively test all the combinations because there's an infinite set of possible input event sequences. Your spec might require, say, that no possible sequence of events can move the FSM from state A to state B without passing through state C in between. At the higher level, that could mean that you can't fire the missiles without releasing the safety. Maybe you can prove that invariant with model checking, but it's a complicated topic, not something that can be trivialized as a "unit test". This might be of interest: https://github.com/tomahawkins/improve/wiki/ImProve
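For a small, explicit transition table, the "no path from A to B that skips C" property can be checked by a graph search rather than by enumerating event sequences. The Python sketch below is only that: a breadth-first search over a hypothetical table. It is not ImProve, and it does not scale the way a real model checker does; the state names and `reachable_avoiding` are made up for the example.

```python
from collections import deque

# Hypothetical transition table: state -> {event: next_state}.
TRANSITIONS = {
    "idle": {"arm": "armed"},
    "armed": {"release_safety": "safety_released", "disarm": "idle"},
    "safety_released": {"fire": "fired", "disarm": "idle"},
    "fired": {},
}


def reachable_avoiding(transitions, start, target, avoid):
    """True if `target` can be reached from `start` without entering `avoid`.

    The search explores each state once, so it terminates even though the
    set of possible event sequences is infinite.
    """
    if start == avoid:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == target:
            return True
        for nxt in transitions.get(state, {}).values():
            if nxt != avoid and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# The invariant holds if "fired" is unreachable once "safety_released"
# is removed from the graph.
invariant_holds = not reachable_avoiding(
    TRANSITIONS, "idle", "fired", "safety_released"
)
```

This works because the property depends only on which states are connected, not on how long the event sequence is. Once the state includes counters, timers or data, the state space grows quickly and this naive search stops being practical.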
On 06/01/18 22:35, Les Cargill wrote:
> Tom Gardner wrote:
>> On 06/01/18 03:29, Les Cargill wrote:
>>> Tom Gardner wrote:
>>>> On 04/01/18 19:31, rickman wrote:
>>>>> rickman wrote on 1/4/2018 3:33 AM:
>>>>>> Les Cargill wrote on 1/3/2018 6:59 PM:
>>>>>>> rickman wrote:
>>>>>>>> Les Cargill wrote on 1/3/2018 7:07 AM:
>>>>>>>>> rickman wrote:
>>>>>>>>>> Ed Prochak wrote on 12/19/2017 11:19 AM:
>>>>>>>>>>> On Saturday, December 16, 2017 at 12:56:31 PM UTC-5, Les Cargill wrote:
>>>>>>>>>>>> Mike Perkins wrote:
>>>>>>>>>>>>> On 09/12/2017 16:05, pozz wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> I use it where I have a number of 'tasks' which then interact with
>>>>>>>>>>>>> each other.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If your system is a pure state machine then there is no need for an
>>>>>>>>>>>>> RTOS.
>>>>>>>>>>>>
>>>>>>>>>>>> Indeed. Indeed.
>>>>>>>>>>>
>>>>>>>>>>> If it is just one state machine. Then yes, indeed.
>>>>>>>>>>> ed
>>>>>>>>>>
>>>>>>>>>> It can always be one state machine. The only issue is how complex the
>>>>>>>>>> combined state machine is.
>>>>>>>>>
>>>>>>>>> It's always a case of "how long should a man's legs be?" Lincoln
>>>>>>>>> is alleged to have said "long enough to touch the ground."
>>>>>>>>>
>>>>>>>>>> Actually the issue is not how many state machines you have. It is the
>>>>>>>>>> timing requirements of the various state machines. If your timing is
>>>>>>>>>> lax, sequential operation of multiple machines is easy. But this often
>>>>>>>>>> becomes a huge discussion with everyone talking past each other.
>>>>>>>>>
>>>>>>>>> They should use state machines to keep track then :)
>>>>>>>>>
>>>>>>>>> If your system can be decomposed as roughly events cross
>>>>>>>>> state, you have a snowball's chance of "understanding" it.
>>>>>>>>
>>>>>>>> Sorry, I don't know what this means "roughly events cross state".
>>>>>>>
>>>>>>> You are in state A. Event 42 occurs. Lookup.... ah, here -
>>>>>>> if we get event 42 whilst in state A, we move to state B
>>>>>>> and send message m.
>>>>>>
>>>>>> So what makes that so hard to understand?
>>>>>
>>>>> To be more explicit, I think every FSM I've ever coded was along the lines
>>>>> of a case statement on the state with IF conditions on all the interesting
>>>>> inputs. I find this structure to be self documenting if the signal names
>>>>> are chosen well.
>>>>>
>>>>> Why is this structure hard to understand?
>>>>
>>>> I've seen a commercial case where that type of structure was
>>>> mutated by people under pressure to:
>>>> - have a single state machine for all different customers
>>>> - make the minimum change
>>>> - do it fast
>>>> - where the original designers had left
>>>> - a custom domain specific language, ugh
>>>>
>>>> The result was an unholy unmaintainable mess, with
>>>> if-then-elses nested up to 10 (ten!) deep.
>>>
>>> Eh, doing it totally, totally wrong. With all due respect - you really
>>> need a configuration system managed well within the code itself, and
>>> customer specializations are but different settings.
>>
>> I didn't have due respect; I had due disrespect.
>>
>> It was all contained in a configuration system, that
>> bloated wrong-headed pile of ordure IBM Rational
>> Clearcase.
>
> We've all been there.
>
> I worked one place where the software chief published a
> config spec daily ( or even more often ).
>
> This being said, CM people
> have a natural affinity for ClearCase.
I was able to clear the Augean stable and base the next generation on plain old simple source code repositories. /Much/ easier and faster and simpler to operate. Naturally it is still /possible/ to create cancerous trees; the only thing preventing that is good taste :(
>> I once, and only once, looked at the version trees.
>> For a couple of years it was respectable and sane:
>> a trunk with a few branches. Then it exploded over
>> the screen and looked like a plant with a serious
>> enzyme disorder: there were even backwards branching
>> loops. I have no idea how they achieved that nor why
>> (other that get it out the door tomorrow).
>>
>>> On one product, the "model number' was nothing more than a name
>>> that pointed to a hard table of defaults for the configuration,
>>> and further config changes were possible. If the support
>>> people requested it, we'd grow a model number when they got too painful.
>>>
>>> That being said - OO and configuration go together like chalk and cheese
>>> :)
>>>
>>>> Completely insane, of course.
>>>
>>> Well, yeah. You can make anything disgusting
>>> if you work hard enough :)
>>
>> And they did work hard enough; enhancements were
>> positively sclerotic.
>>
>>>> My preference is for a dispatch table based on event+state,
>>>> since that forces you to consider all possibilities.
>>>
>>> Yup yup. And you can generate a test vector for it with combinators.
>>> Then testing essentially becomes compressing the output - tedious, but
>>> ... rewarding.
>>>
>>> If the tested FSM is properly tested, then all remaining
>>> defects are 1) either tweak/tone of unrealistic expectations or
>>> 2) hardware bugs.
>>>
>>>> The dispatch table can be either a 2D array of function pointers,
>>>> or inherent in an OOP inheritance hierarchy where the
>>>> hierarchy is a direct mirror of the state hierarchy.
>>>
>>> There's something to be said for the 2d approach. It's
>>> more canonical ( and more normal ) for one.
>>
>> Agreed, but it can be convenient to express the design
>> in other terms. For example:
>> - the top level catches events that have not been dealt
>>   with elsewhere; I prefer this to be a "should not
>>   happen" which is logged
>> - next level down is divided into a small number of
>>   superstates, maybe "initialising", "working normally",
>>   "gross fault recovery"
>> - bottom level is the normal states, e.g. "door open",
>>   "door closed", "fire alarm", etc
>> Events are "delivered" to the bottom level and, if not
>> handled, percolate up to the next level. Levels naturally
>> correspond to an OOP class hierarchy.
>
> To an extent, that's true. It seemed to make the systems
> engineers happier than the software guys.
I wonder why? I've found the design pattern outlined above to be simple to comprehend and modify. It does, however, tend to need some boilerplate (hence ignorable) "trampoline" methods to get an incoming event to the appropriate method (event handler) in the appropriate class (state). Some might object to that, but often they can be places to insert logging functions.
>> That's nothing new, see Harel's StateCharts. Having said
>> that, I'm not overly fond of some aspects of StateCharts.
>
> Right. But they're adaptable.
I merely used them as inspiration. Where there was a choice between compliance with Statechart semantics and fast simple code design patterns, I chose the latter.
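One way to picture the pattern described in this post (states as classes, unhandled events percolating up the hierarchy, a small "trampoline" that routes each event to a handler) is the Python sketch below. All class names, the `on_<event>` naming convention and the choice of which superstate owns which state are assumptions for illustration; it does not follow Statechart semantics and is not the author's code.

```python
should_not_happen = []  # hypothetical log of events nobody handled


class TopLevel:
    """Top of the hierarchy: catches events not dealt with lower down."""

    def deliver(self, event):
        # Trampoline: find on_<event> through normal method resolution,
        # so a handler on a superstate is used when the state has none.
        # This is also a convenient place to add logging.
        handler = getattr(self, "on_" + event, None)
        if handler is None:
            return self.unhandled(event)
        return handler()

    def unhandled(self, event):
        should_not_happen.append((type(self).__name__, event))
        return self  # stay in the current state


class WorkingNormally(TopLevel):
    """Superstate: smoke is handled the same way in every normal state."""

    def on_smoke(self):
        return FireAlarm()


class GrossFaultRecovery(TopLevel):
    """Superstate: a reset returns to normal operation."""

    def on_reset(self):
        return DoorClosed()


class DoorClosed(WorkingNormally):
    def on_open(self):
        return DoorOpen()


class DoorOpen(WorkingNormally):
    def on_close(self):
        return DoorClosed()


class FireAlarm(GrossFaultRecovery):
    pass  # everything except reset percolates to the top and is logged
```

Usage is a loop of the form `state = state.deliver(event)`. Delivering `"smoke"` to `DoorOpen` is handled one level up in `WorkingNormally`, while delivering `"open"` to `FireAlarm` falls through to `TopLevel.unhandled` and is recorded.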
