EmbeddedRelated.com

Self restarting property of RTOS-How it works?

Started by Unknown February 7, 2005
CBFalconer wrote:

> Tim Wescott wrote:
>
> ... snip ...
>
>> The Therac 25 killed because of piss-poor design in a piss-poor
>> environment that allowed piss-poor software quality -- probably
>> because people had the attitude that mistakes happen and errors
>> that may kill me or you are acceptable as long as money is being
>> made. For details on the Therac 25 accidents see
>> http://www.embedded.com/showArticle.jhtml?articleID=55300689 and
>> http://sunnyday.mit.edu/papers/therac.pdf. When those tasks
>> "went for a toss" it was because several somebodies weren't paying
>> a hell of a lot of attention.
>>
>> The _only_ excuse that can be granted is that the Therac 25 was
>> one of the first systems where the software had the power to kill
>> people. It's been 20 years, we should know better now. These
>> days going to exceptional lengths to insure software quality in
>> life-critical applications is no more playing god than changing
>> your tires when they go bald, it's simply doing what should be
>> done.
>
> I was building software that could kill people 10 years before
> that. I was aware of it, and it didn't. By the late '70s my
> software was almost all implemented in Pascal (real ISO standard
> Pascal, with full range checking etc. - not the Borland mixture)
> with everything from the naked board up under my control. Yes,
> there were failures, usually hardware, but the effect was to stop
> and scream for human help.
>
> It doesn't take a genius to recognize potential pitfalls. People
> have been building fail-safe mechanisms for centuries.
>
It was nearly 1:00AM, and I left out the "in the medical industry". Just out of curiosity what industry were you working in?

I have noticed that there is a strong tendency for people to view the software engineering process as somehow fundamentally different from mechanical, electrical or other "old-line" engineering processes -- so where there will be significant controls and reviews on mechanical and electrical assemblies, a software engineer may be allowed to compile something, test it for two minutes, and ship product.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
Tim Wescott wrote:
> I have noticed that there is a strong tendency for people to view the
> software engineering process as somehow fundamentally different from
> mechanical, electrical or other "old-line" engineering processes -- so
> where there will be significant controls and reviews on mechanical and
> electrical assemblies, a software engineer may be allowed to compile
> something, test it for two minutes, and ship product.
I attended a lecture by Jack Ganssle on ESC last September in which he talked about a software engineer whose software error rate was way lower than their co-workers'. The guy was formerly a hardware engineer. When asked about it, the guy said nobody told him he was allowed to err. The pervasive "bugs are unavoidable" attitude of software engineering professionals has perhaps established a tolerance for errors much higher than it would be acceptable. The fact "it's just software, please add this and this last minute feature and fix" doesn't help at all.

Just my EU0.02

Elder.
>
> You can write crappy code in any language.
>
> You can write solid, quality code in any language.
>
Clearly both statements are true. The question is: does the tool set help or hinder the programmer? Can average programmers find and fix subtle bugs more or less easily than with another system? Is the system so large and complex that even brilliant programmers need special tools to achieve acceptable reliability?
> I liken writing life-critical code to driving a car full of sleeping
> children: I can pretty much do anything I choose, but some of my choices
> can be quite disastrous for a number of innocent lives. If _all_ layers
> of your organization take responsibility for the lives of the people who
> can be affected by the product, and if the people working with the
> software are aware of the known methods for developing then chances are
> the software will be up to snuff. It doesn't really matter whether the
> language is C, Ada, assembly or COBOL (although a flight-control or
> biomedical system written in COBOL boggles the mind).
>
> It ain't the code, it's the coders that make the difference.
>
> Tim Wescott
> Wescott Design Services
> http://www.wescottdesign.com
Sometimes you need more than just conscientious programmers. Sometimes tools and more advanced languages and operating systems pay for themselves by reducing the cost to achieve a given level of reliability.

Mike Sicilian
Tim Wescott wrote:
>
... snip ...
>
> It was nearly 1:00AM, and I left out the "in the medical industry".
> Just out of curiosity what industry were you working in?
<http://cbfalconer.home.att.net>

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
If you restart a task, you restart it from the beginning; you do not set it back to where it was working. If it has crashed ("gone for a toss"), there is no place to go back to and resume from. Only a paused task can resume.

The points everyone is getting at are:
1. Crashing is always a software problem. No input should cause the code to get lost.
2. Restarting a crashed task may be a bad idea. For the Therac example, suppose the dose task dies. If you restart it,
does it try to give another dose? What if it keeps crashing and restarting? Can the system figure out whether the
restart is helping or making things worse?
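
One way to approach the "what if it keeps crashing and restarting" question is to give the supervisor a restart budget. The following is only a minimal sketch: `restart_budget_t`, `budget_init()` and `budget_allow_restart()` are hypothetical names invented for illustration, not part of VxWorks or any other RTOS, and the tick source is assumed to be a free-running 32-bit counter. It does not answer the harder question of whether a restart would repeat a dose; it only bounds how often a restart is attempted before escalating.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical restart budget; all names and limits are illustrative. */
typedef struct {
    uint32_t window_ticks;  /* length of the observation window, in ticks */
    uint32_t max_restarts;  /* restarts tolerated inside one window */
    uint32_t window_start;  /* tick at which the current window began */
    uint32_t count;         /* restarts granted in the current window */
} restart_budget_t;

void budget_init(restart_budget_t *b, uint32_t now,
                 uint32_t window_ticks, uint32_t max_restarts)
{
    assert(b != NULL);
    b->window_ticks = window_ticks;
    b->max_restarts = max_restarts;
    b->window_start = now;
    b->count = 0;
}

/* Returns true if one more restart is allowed at tick `now`.
 * Returns false when the budget is spent, meaning the caller should
 * escalate (safe state, watchdog reset, operator alarm) instead. */
bool budget_allow_restart(restart_budget_t *b, uint32_t now)
{
    assert(b != NULL);

    /* Unsigned subtraction stays correct across tick counter wrap-around. */
    if ((uint32_t)(now - b->window_start) >= b->window_ticks) {
        b->window_start = now;
        b->count = 0;
    }
    if (b->count >= b->max_restarts) {
        return false;
    }
    b->count++;
    return true;
}
```

With a window of 1000 ticks and a limit of 2, a third crash inside the same window is refused, which is the point where the system has evidence that restarting is not helping.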

One use for restart would be this (to use my battery charger as an example, though I did not do it this way):
The battery charge task has found a battery and is charging it. The removal task has found that the battery was
removed. It could restart the charge task, which would send it back to looking for a new battery. This may or
may not work better than sending a message or setting a flag.

I respond to a crashed Task by forcing a watchdog reset.  I know that is safe and preferred in my system.
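
A common way to get that behaviour is to let a supervisor service the watchdog only when every task has checked in, so a crashed or hung task leads to a hardware reset. This is a rough sketch, not my actual code: `task_checkin()`, `supervisor_poll()` and `hw_watchdog_kick()` are made-up names, the task count is arbitrary, and the kick function is a stand-in counter rather than a real watchdog register write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TASK_COUNT     3u
#define ALL_TASKS_MASK ((1u << TASK_COUNT) - 1u)

/* One bit per task, set when that task reports it is alive.
 * On a real target, updates need a critical section or atomic access. */
uint32_t checkin_flags;

/* Stand-in for the hardware: counts kicks instead of touching a register. */
unsigned kick_count;

void hw_watchdog_kick(void)
{
    kick_count++;
}

/* Called by each task from its main loop. */
void task_checkin(unsigned task_id)
{
    assert(task_id < TASK_COUNT);
    checkin_flags |= (1u << task_id);
}

/* Called periodically by the supervisor. The watchdog is kicked only if
 * every task checked in since the last poll; otherwise it is left alone
 * so that the hardware timer expires and resets the system. */
bool supervisor_poll(void)
{
    bool healthy = (checkin_flags == ALL_TASKS_MASK);
    if (healthy) {
        hw_watchdog_kick();
    }
    checkin_flags = 0;  /* every task must check in again next period */
    return healthy;
}
```

The design choice here is that no code path restarts an individual task; a missing check-in simply starves the watchdog, and the whole system comes back up through its normal, tested power-on path.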

Hope this helps.



"Elder Costa" <elder.costa@terra.com.br> wrote in message
news:36vf5kF55ik1aU1@individual.net...
> Tim Wescott wrote:
>
> > I have noticed that there is a strong tendency for people to view the
> > software engineering process as somehow fundamentally different from
> > mechanical, electrical or other "old-line" engineering processes -- so
> > where there will be significant controls and reviews on mechanical and
> > electrical assemblies, a software engineer may be allowed to compile
> > something, test it for two minutes, and ship product.
>
> I attended a lecture by Jack Ganssle on ESC last September in which he
> talked about a software engineer whose software error rate was way lower
> than their co-workers'. The guy was formerly a hardware engineer. When
> asked about it, the guy said nobody told him he was allowed to err. The
> pervasive "bugs are unavoidable" attitude of software engineering
> professionals has perhaps established a tolerance for errors much higher
> than it would be acceptable. The fact "it's just software, please add
> this and this last minute feature and fix" doesn't help at all.
>
> Just my EU0.02
>
> Elder.
Like hardware doesn't have bugs or "errata". Go check Intel's web site. Especially new hardware. sheesh
s_subbarayan@rediffmail.com wrote:

>Dear all,
> I was going through the explanation for the taskRestart() call in the
>vxworks reference manual.
===SNIP===
>Vxworks reference manual for taskRestart() states the following:
>"This routine "restarts" a task. The task is first terminated, and then
>reinitialized with the same ID, priority, options, original entry
>point, stack size, and parameters it had when it was terminated.
===SNIP===
>After restarting will the task start executing from the starting point
>or will execute from where it left?
From the starting point.
>In case it starts executing from scratch from the place where it was
>spawned, will it not cause problems to my application?
That is for you to determine. I have no idea how it would affect *your* application. I only know how it would affect *mine*.
>For eg, I am in
>middle of some biomedical application and some task goes for a toss and
>I restart it using this call, the task will start performing everything
>from scratch which may not be suitable to the current status of
>application...
So it sounds like you've answered your question -- taskRestart() "may not be suitable...".

--
Dan Henry
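
Since a restarted task begins again at its entry point with its original parameters, anything it needs to know about work already done has to live outside the task. As a hedged illustration only, the sketch below shows an entry-point decision driven by a record that is assumed to survive the restart. `persistent_state_t`, `entry_action_t` and `dose_task_entry_decision()` are hypothetical names, and how the record is stored (battery-backed RAM, a global outside the task, and so on) is an assumption left to the application. It is not a medical-device design.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record kept outside the task so it survives a restart. */
typedef struct {
    bool dose_started;    /* set just before delivery begins */
    bool dose_completed;  /* set only after delivery is confirmed */
} persistent_state_t;

typedef enum {
    ACTION_DELIVER,            /* nothing done yet: safe to start */
    ACTION_SKIP,               /* already finished: do not repeat */
    ACTION_HALT_FOR_OPERATOR   /* interrupted midway: amount unknown */
} entry_action_t;

/* What the task should do when it (re)enters at its entry point. */
entry_action_t dose_task_entry_decision(const persistent_state_t *s)
{
    assert(s != NULL);

    if (s->dose_completed) {
        return ACTION_SKIP;
    }
    if (s->dose_started) {
        /* The task died partway through; starting from scratch would
         * ignore whatever was already delivered, so stop and ask. */
        return ACTION_HALT_FOR_OPERATOR;
    }
    return ACTION_DELIVER;
}
```

The interrupted case is the one the original question worries about: a blind restart would land in `ACTION_DELIVER` every time, which is exactly the "from scratch" behaviour that may not suit the current state of the application.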
Neil Kurzman wrote:

>1. Crashing is always a software problem.
Only if you define "crashing" as a subset "software problem."

Your code can be perfect, but if the crystal oscillator stops it *will* crash.
On 2005-02-10, Guy Macon <_see.web.page_@_www.guymacon.com_> wrote:
>
> Neil Kurzman wrote:
>
>>1. Crashing is always a software problem.
>
> Only if you define "crashing" as a subset "software problem."
>
> Your code can be perfect, but if the crystal oscillator stops
> it *will* crash.
No it won't; it just won't get anywhere. I mean, even if the DRAM disappears because it's not getting refreshed anymore, the software won't crash because time has stopped. It's frozen.

--
Roger Ivie
rivie@ridgenet.net
http://anachronda.webhop.org/
-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCS/P d- s:+++ a+ C++ UB--(++++) !P L- !E W++ N++ o-- K w O- M+ V+++ PS+
PE++ Y+ PGP t+ 5+ X-- R tv++ b++ DI+++ D+ G e++ h--- r+++ z+++
------END GEEK CODE BLOCK------
Elder Costa wrote:
> Tim Wescott wrote:
>
>> I have noticed that there is a strong tendency for people to view the
>> software engineering process as somehow fundamentally different from
>> mechanical, electrical or other "old-line" engineering processes -- so
>> where there will be significant controls and reviews on mechanical and
>> electrical assemblies, a software engineer may be allowed to compile
>> something, test it for two minutes, and ship product.
>
> I attended a lecture by Jack Ganssle on ESC last September in which he
> talked about a software engineer whose software error rate was way lower
> than their co-workers'. The guy was formerly a hardware engineer. When
> asked about it, the guy said nobody told him he was allowed to err. The
> pervasive "bugs are unavoidable" attitude of software engineering
> professionals has perhaps established a tolerance for errors much higher
> than it would be acceptable. The fact "it's just software, please add
> this and this last minute feature and fix" doesn't help at all.
I have noted previously here in c.a that most of the _really_ good low-level/systems programmers I know seem to have an engineering instead of computer science background.

A coincidence?

Terje

--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
