Reply by Phil Hobbs February 2, 2019
On 2/2/19 11:44 AM, Kent Dickey wrote:
> In article <48816c44-ddb7-4c2b-8be8-c20d25833f8d@googlegroups.com>, > StateMachineCOM <statemachineguru@gmail.com> wrote: >>> @Phil Hobbs, @Clifford Heath, @Reinhardt Behm, @A.P.Richelieu: >> >> You guys cannot have it both ways. >> >> If you truly believe that your production code is so good that >> assertions will NEVER fire in the field, then why are you so afraid of >> leaving them in the code? Of course, assertions cause some overhead, but >> they give you that last line of defense to do SOMETHING when the system >> goes out of control. >> >> But you apparently ARE afraid that your assertions will fire too often >> in the field. (Otherwise you would not be talking of "the hammer being >> too big"). In this case, your solution is to disable assertions? And to >> do WHAT? Pray that the system will somehow miraculously recover and >> correct itself? What are the chances for this to work? > > You need to realize your assertions can have bugs just like your code. > In fact, I'm sure the bug rate for assertions is at least an order of > magnitude higher than for other code, possibly several orders of magnitude. > > Say you have a routine which returns a pointer to a structure, and a size of > that structure (so it can get bigger in the future). > > struct Thing *my_thing; > ret = get_a_thing(&my_thing, &size); > assert(size == sizeof(Thing)); > > In all of your testing, this will succeed, since Thing will match the > size being returned for now. > > But then an upgrade occurs, and the size returned is bigger. Now, the > code would work fine (the interface is designed to support this by adding > new fields in the future), but now your assert goes off. This is bad--you > basically have made your code fail on a system upgrade. > > This is a trivial example, but it gets at the issue--asserts can test for > finicky details which actually don't matter. And for tesitng, this is fine, > actually probably a good thing, since it makes code assumptions clear. 
> > You even point out that assertions make the code less robust--you're more > likely to crash. And you expose yourself to crashing in cases where ignoring > the "error" would be innocuous. I get very frustrated at old programs > which now fail due to stupid checks which they should not be doing. > I mean, I'd prefer a program to keep running and think it was the year 1900 > then to hard fail once year 2000 hit. > > Plus, the dirty secret no one wants to say out loud--buggy code that keeps > running is often more useful than code which "stops". We've got a lot of > experiential evidence that pretty massive bugs can often be worked around > to keep a system running. One downside, of course, is buggy code is often > a giant security hole. I don't think there's a magic wand which easily > balances the need to make the code useful and robust, and secure. > > So, what people are clearly saying is: you need at least 2 levels of > assertions, and the assert I gave as an example above definitely should not > be in production code (but is actually fine in your test builds). > In fact, I sometimes use assertions for things I don't know are true, just > to see if testing hits the case. These should not be in production code. > > Kent >
I think the religious war over assertions is mostly due to sloppy use of language, specifically whether "assertion" means "the C assert() macro or some near relative that calls abort() or does a hard reset" or "appropriate runtime error checking".

All of our positions are closer than they appear.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
http://hobbs-eo.com
Reply by Kent Dickey February 2, 2019
In article <48816c44-ddb7-4c2b-8be8-c20d25833f8d@googlegroups.com>,
StateMachineCOM  <statemachineguru@gmail.com> wrote:
>> @Phil Hobbs, @Clifford Heath, @Reinhardt Behm, @A.P.Richelieu: > >You guys cannot have it both ways. > >If you truly believe that your production code is so good that >assertions will NEVER fire in the field, then why are you so afraid of >leaving them in the code? Of course, assertions cause some overhead, but >they give you that last line of defense to do SOMETHING when the system >goes out of control. > >But you apparently ARE afraid that your assertions will fire too often >in the field. (Otherwise you would not be talking of "the hammer being >too big"). In this case, your solution is to disable assertions? And to >do WHAT? Pray that the system will somehow miraculously recover and >correct itself? What are the chances for this to work?
You need to realize your assertions can have bugs just like your code. In fact, I'm sure the bug rate for assertions is at least an order of magnitude higher than for other code, possibly several orders of magnitude.

Say you have a routine which returns a pointer to a structure, and a size of that structure (so it can get bigger in the future).

    struct Thing *my_thing;
    ret = get_a_thing(&my_thing, &size);
    assert(size == sizeof(struct Thing));

In all of your testing, this will succeed, since Thing will match the size being returned for now.

But then an upgrade occurs, and the size returned is bigger. Now, the code would work fine (the interface is designed to support this by adding new fields in the future), but now your assert goes off. This is bad--you basically have made your code fail on a system upgrade.

This is a trivial example, but it gets at the issue--asserts can test for finicky details which actually don't matter. And for testing, this is fine, actually probably a good thing, since it makes code assumptions clear.

You even point out that assertions make the code less robust--you're more likely to crash. And you expose yourself to crashing in cases where ignoring the "error" would be innocuous. I get very frustrated at old programs which now fail due to stupid checks which they should not be doing. I mean, I'd prefer a program to keep running and think it was the year 1900 than to hard-fail once year 2000 hit.

Plus, the dirty secret no one wants to say out loud--buggy code that keeps running is often more useful than code which "stops". We've got a lot of experiential evidence that pretty massive bugs can often be worked around to keep a system running. One downside, of course, is buggy code is often a giant security hole. I don't think there's a magic wand which easily balances the need to make the code useful and robust, and secure.

So, what people are clearly saying is: you need at least 2 levels of assertions, and the assert I gave as an example above definitely should not be in production code (but is actually fine in your test builds). In fact, I sometimes use assertions for things I don't know are true, just to see if testing hits the case. These should not be in production code.

Kent
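The "two levels of assertions" idea can be sketched in C roughly as follows. This is only an illustration under stated assumptions: `struct Thing`'s fields, the `STRICT_CHECKS` build flag, `DEV_ASSERT` and `use_thing()` are all hypothetical names, not anything from the post above. The strict equality check exists only in test builds, while the production check rejects just those sizes that are genuinely unusable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure; the fields are illustrative only. */
struct Thing {
    int id;
    int flags;
};

/* Level 1: strict checks for test builds only. Build with
 * -DSTRICT_CHECKS to enable them; otherwise they compile to nothing. */
#ifdef STRICT_CHECKS
#define DEV_ASSERT(cond) assert(cond)
#else
#define DEV_ASSERT(cond) ((void)0)
#endif

/* Returns 0 on success, -1 if the thing cannot be used. */
int use_thing(const struct Thing *thing, size_t reported_size, int *id_out)
{
    /* Level 1: documents today's assumption; would fire after an upgrade
     * that enlarges the structure. */
    DEV_ASSERT(reported_size == sizeof(struct Thing));

    /* Level 2: stays in production and rejects only a null pointer or a
     * size too small to contain the fields this code knows about. */
    if (thing == NULL || reported_size < sizeof(struct Thing)) {
        return -1;
    }
    *id_out = thing->id;
    return 0;
}
```

With `STRICT_CHECKS` undefined, a larger reported size is accepted, so an upgrade that appends fields does not break the caller.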
Reply by A.P.Richelieu January 31, 2019
On 2019-01-30 at 21:31, StateMachineCOM wrote:
>> @Phil Hobbs, @Clifford Heath, @Reinhardt Behm, @A.P.Richelieu: > > You guys cannot have it both ways. > > If you truly believe that your production code is so good that assertions will NEVER fire in the field, then why are you so afraid of leaving them in the code? Of course, assertions cause some overhead, but they give you that last line of defense to do SOMETHING when the system goes out of control.
Because assertions add code size and reduce performance.
> > But you apparently ARE afraid that your assertions will fire too often in the field. (Otherwise you would not be talking of "the hammer being too big"). In this case, your solution is to disable assertions? And to do WHAT? Pray that the system will somehow miraculously recover and correct itself? What are the chances for this to work?
If you keep an assertion which fires in the field, then you have a bug in your program. The bug is that you did an assertion, instead of validating data. You basically misused the assertion for something it was not designed to handle.
> > This strikes me as backwards in the software business. Your code is either in complete control of the machine or it isn't. There is really nothing in between. > >> The hammer might be much too big. Would you like the engines >> in an aircraft to just stop because an assertion stopped the >> engine controller? > > Who said that a failing assertion must always stop the system? In the end YOU are designing the "hammer" (assertion handler), and so it is YOUR job to design it correctly for the circumstance: put the system in the fail-safe mode, whatever that means. And yes, I would prefer that an aircraft engine controller would reset and perhaps re-start the engine mid-air as opposed to blow up the engine. >
I do. Assertions should only continue execution if it would be dangerous to stop.
>> But production code is without any such things. They all get >> removed and then the _changed_ software gets verified again. > > What you do in your development version is your business. If you choose to use "assertion-like" constructs, which aren't checking for conditions that should NOT happen (like array index out of bounds or de-referencing a NULL pointer), then of course you should remove them for production. > > But by removing ALL assertions in the final code you are throwing the baby out with the bathwater. You lose your last line of defense to do damage control and to protect your system and all its users.
Since assertions are supposed to be removed in production code, your design is flawed. You used assertions instead of run-time checks. Learn the difference between assertions and run-time checks.
> > Miro Samek > state-machine.com >
AP
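One way to picture the distinction between a run-time check and an assertion is the small C sketch below. The function names `digit_value()` and `digit_char()` are made up for illustration. External input can legitimately be anything, so it gets a check that survives in production; a violated internal contract can only be a programming error, so it gets an `assert()` that may be compiled out with `NDEBUG`.

```c
#include <assert.h>
#include <stdio.h>

/* Run-time check: the character comes from outside the program, so a
 * non-digit (or EOF) is a legitimate condition, not a bug.
 * Returns 0..9 for a digit, -1 for anything else. */
int digit_value(int c)
{
    if (c < '0' || c > '9') {
        return -1;
    }
    return c - '0';
}

/* Assertion: callers inside the program promise 0..9. A violation is a
 * programming error, so the check may be removed with NDEBUG. */
char digit_char(int value)
{
    assert(value >= 0 && value <= 9);
    return (char)('0' + value);
}
```

A caller would write something like `int d = digit_value(getchar());` and handle `-1` explicitly, rather than asserting on what the user typed.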
Reply by Phil Hobbs January 30, 2019
On 1/30/19 3:31 PM, StateMachineCOM wrote:
>> @Phil Hobbs, @Clifford Heath, @Reinhardt Behm, @A.P.Richelieu: > > You guys cannot have it both ways. > > If you truly believe that your production code is so good that > assertions will NEVER fire in the field, then why are you so afraid > of leaving them in the code?
< Attempt to generate controversy over an old article noted.> ;) Afraid? Who's afraid?
> Of course, assertions cause some overhead, but they give you that > last line of defense to do SOMETHING when the system goes out of > control.
assert() is too blunt an instrument to use as a general solution, because calling abort() or doing a hard reset is not necessarily a good defense in all situations. It might fit some situations, but it can make things worse in others. A more nuanced assert()-like approach, such as the carefully() macro I posted upthread, is another matter. It was specifically designed to be left in the production code.
> > But you apparently ARE afraid that your assertions will fire too > often in the field. (Otherwise you would not be talking of "the > hammer being too big").
Because sometimes it is. We've supplied examples.
> In this case, your solution is to disable assertions? And to do > WHAT? Pray that the system will somehow miraculously recover and > correct itself? What are the chances for this to work?
Straw man alert. The choices are not limited to "leave all assert()s in production code" and "all the children will die!" There are lots of ways of doing runtime error checking. For instance, unit tests are loaded with checks for correct program logic. Maguire talks about a MC68000 disassembler whose error checking system included a complete alternative disassembler (logic- vs. table-driven iirc). Once you've exercised all the code paths, you don't need the second disassembler.
> > This strikes me as backwards in the software business. Your code is > either in complete control of the machine or it isn't. There is > really nothing in between.
There's a lot of daylight between wanting the software to be in control and issuing an edict forbidding #define NDEBUG.
> >> The hammer might be much too big. Would you like the engines in an >> aircraft to just stop because an assertion stopped the engine >> controller? > > Who said that a failing assertion must always stop the system?
In the event of failure, the C assert() macro calls abort().
> In the end YOU are designing the "hammer" (assertion handler), and > so it is YOUR job to design it correctly for the circumstance: put > the system in the fail-safe mode, whatever that means.
'When I make a word do a lot of work like that,' said Humpty Dumpty, 'I always pay it extra.' If by 'assertion', you merely mean 'appropriate error handling code hidden by a macro', we're in violent agreement. But to me at least, 'assertion' means the C library assert() facility or a close replica--something that applies the biggest available hammer--abort() or a hard reset in the event of failure.
> And yes, I would prefer that an aircraft engine controller would > reset and perhaps re-start the engine mid-air as opposed to blow up > the engine. > >> But production code is without any such things. They all get >> removed and then the _changed_ software gets verified again. > > What you do in your development version is your business. If you > choose to use "assertion-like" constructs, which aren't checking for > conditions that should NOT happen (like array index out of bounds or > de-referencing a NULL pointer), then of course you should remove > them for production. > > But by removing ALL assertions in the final code you are throwing the > baby out with the bathwater. You lose your last line of defense to do > damage control and to protect your system and all its users.
Nobody is advocating for removing all error checking in production code. There are certainly lots of cases where abort() or a hard reset are the right response, but that's far from universal.

I use a whole lot of hard assertions in debug code, as I've said--more than I could stand in production, especially in an embedded system where code space is often at a premium. And of course since assertions have branches, they frequently mess up compiler optimizations.

The least-bad bugs are the ones you find by code reading; next are ones the compiler catches; third are the ones assert() finds; fourth are runtime error checks in production; and the worst are the ones that just make the system crash in some uncontrolled way.

Thus I like careful coding, static analysis, tight compiler flag settings, assert(), and carefully designed runtime error checking, in that order.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
http://hobbs-eo.com
Reply by StateMachineCOM January 30, 2019
> @Phil Hobbs, @Clifford Heath, @Reinhardt Behm, @A.P.Richelieu:
You guys cannot have it both ways. If you truly believe that your production code is so good that assertions will NEVER fire in the field, then why are you so afraid of leaving them in the code? Of course, assertions cause some overhead, but they give you that last line of defense to do SOMETHING when the system goes out of control. But you apparently ARE afraid that your assertions will fire too often in the field. (Otherwise you would not be talking of "the hammer being too big"). In this case, your solution is to disable assertions? And to do WHAT? Pray that the system will somehow miraculously recover and correct itself? What are the chances for this to work? This strikes me as backwards in the software business. Your code is either in complete control of the machine or it isn't. There is really nothing in between.
> The hammer might be much too big. Would you like the engines > in an aircraft to just stop because an assertion stopped the > engine controller?
Who said that a failing assertion must always stop the system? In the end YOU are designing the "hammer" (assertion handler), and so it is YOUR job to design it correctly for the circumstance: put the system in the fail-safe mode, whatever that means. And yes, I would prefer that an aircraft engine controller would reset and perhaps re-start the engine mid-air as opposed to blow up the engine.
> But production code is without any such things. They all get > removed and then the _changed_ software gets verified again.
What you do in your development version is your business. If you choose to use "assertion-like" constructs, which aren't checking for conditions that should NOT happen (like array index out of bounds or de-referencing a NULL pointer), then of course you should remove them for production. But by removing ALL assertions in the final code you are throwing the baby out with the bathwater. You lose your last line of defense to do damage control and to protect your system and all its users. Miro Samek state-machine.com
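The idea of designing your own "hammer" might look roughly like the C sketch below. It assumes a heater controller where "heater off" is the safe state; every name in it (`PROD_ASSERT`, `set_assert_handler`, `enter_fail_safe`, `command_heater_duty`) is hypothetical and is not taken from any real framework. The assertion stays in the production build, and the application, not the C library, decides what a failure does.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical handler type: called with the location of the failure. */
typedef void (*assert_handler_t)(const char *file, int line);

/* Conservative default when the application installs nothing. */
static void default_handler(const char *file, int line)
{
    (void)file;
    (void)line;
    abort();
}

static assert_handler_t current_handler = default_handler;
static bool fail_safe_active = false;

void set_assert_handler(assert_handler_t handler)
{
    current_handler = (handler != NULL) ? handler : default_handler;
}

void assert_failed(const char *file, int line)
{
    current_handler(file, line);
}

/* Stays enabled in production builds; NDEBUG does not affect it. */
#define PROD_ASSERT(cond) \
    ((cond) ? (void)0 : assert_failed(__FILE__, __LINE__))

/* Example application handler: latch a fail-safe mode. A real system
 * would also log the location and drive its outputs to a safe state. */
void enter_fail_safe(const char *file, int line)
{
    (void)file;
    (void)line;
    fail_safe_active = true;
}

bool in_fail_safe(void)
{
    return fail_safe_active;
}

/* Returns the duty cycle actually applied: 0 once fail-safe is latched. */
int command_heater_duty(int percent)
{
    PROD_ASSERT(percent >= 0 && percent <= 100);
    return in_fail_safe() ? 0 : percent;
}
```

After `set_assert_handler(enter_fail_safe)`, an out-of-range request latches the fail-safe mode instead of aborting the program. Whether latching, resetting, or something else is the right response depends entirely on the system.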
Reply by A.P.Richelieu January 30, 2019
On 2019-01-28 at 21:31, StateMachineCOM wrote:
>> Assertions are not the same thing as checking your input. > > Absolutely. You need to very carefully distinguish between the erroneous behavior (a.k.a. bug) and exceptional condition, which is rare but can arise legitimately. Assertions are for errors. I've written specifically about it in the Dr.Dobb's article "An Exception or a Bug?" [http://www.drdobbs.com/an-exception-or-a-bug/184401686 ] > >> Assertions are there to check that your code is sane. >> They are designed to be removed in production code. > > I'm exactly challenging this beaten-path point of view, because it suggests to stop checking the sanity of the production code. This would work if *all* errors are completely removed during debugging. Are they really removed in YOUR code? >
Not really. Here is an example of an assertion which can be removed.

We have an FPGA which contains registers. For various reasons, we may want to change the address or definition of registers. The FPGA tools + scripts will automatically generate a header file with register addresses.

In my code I want to access the FPGA registers as a struct. I have assertions checking that the offset of each register in the struct matches the #defines in the automatically generated headers.

If all assertions are OK, then recompiling without those assertions will not cause a problem in a production system.

If, however, I write the following code:

    int getnumber(void)
    {
        int c = getchar();
        assert(c >= '0');
        assert(c <= '9');
        return c - '0';
    }

then I have totally misunderstood what assertions are all about.

Here is another assertion that can be removed.

    // Check that table.var can accept all values of char
    #ifdef __ASSERT
    for c in char'range do:
        store c in table.var;
        assert(table.var == c)
    #endif

    // And here is another
    #ifdef __ASSERT
    for c in int'range do:
        store c in table.var;
        assert(table.var == c); // will fail on c == 256
    #endif

The latter example triggers an error, so the code needs changing. The code is not needed in production.
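The register-offset checks described above could be sketched in C as below. Note the swapped-in technique: this uses compile-time `_Static_assert` (C11) with `offsetof`, which may differ from how the poster actually wrote his assertions. The register names and offset values stand in for the generated header and are purely hypothetical.

```c
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Stand-ins for the automatically generated header; the names and
 * values are hypothetical. */
#define FPGA_REG_CTRL_OFFSET   0x00u
#define FPGA_REG_STATUS_OFFSET 0x04u
#define FPGA_REG_DATA_OFFSET   0x08u

/* Hand-written register map used by the driver code. */
struct fpga_regs {
    volatile uint32_t ctrl;
    volatile uint32_t status;
    volatile uint32_t data;
};

/* Checked at compile time: if the FPGA layout changes and the struct is
 * not updated, the build fails. Nothing remains in the binary. */
_Static_assert(offsetof(struct fpga_regs, ctrl) == FPGA_REG_CTRL_OFFSET,
               "ctrl offset does not match generated header");
_Static_assert(offsetof(struct fpga_regs, status) == FPGA_REG_STATUS_OFFSET,
               "status offset does not match generated header");
_Static_assert(offsetof(struct fpga_regs, data) == FPGA_REG_DATA_OFFSET,
               "data offset does not match generated header");
```

Because these checks cost nothing at run time, the question of removing them for production does not arise. That fits the point being made: they verify the code, not the data.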
> And also, relevant for the OP, are you really suggesting to leave the watchdog in the production code while disabling other assertions. If so, WHY?
The watchdog allows you to reset or interrupt the processing, which can be used to invoke recovery functions, including logging the error, allowing you to resume normal operation, and also to pinpoint the problem later. A hanging device will make no one happy.
> > I'm looking forward to interesting discussion... >
AP
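A common way to use a watchdog against hangs is to kick it only when every task has recently reported in. The C sketch below simulates that pattern; it is not the poster's code, and `hw_watchdog_kick()` is a hypothetical stand-in that merely counts calls where real firmware would write a device-specific register.

```c
#include <stdbool.h>
#include <stdint.h>

enum { TASK_COUNT = 3 };

static uint32_t alive_flags;   /* one bit per task that has checked in */
static unsigned kick_count;    /* simulation only: counts hardware kicks */

/* Hypothetical stand-in for the device-specific register write. */
static void hw_watchdog_kick(void)
{
    kick_count++;
}

/* Each task calls this once per cycle to show it is still running. */
void task_checkin(unsigned task_id)
{
    if (task_id < TASK_COUNT) {
        alive_flags |= (uint32_t)1u << task_id;
    }
}

/* Called periodically. Kicks the watchdog only if every task has
 * checked in since the last kick; otherwise the kick is withheld and
 * the hardware timer is left to expire and reset the system. */
bool watchdog_service(void)
{
    const uint32_t all_tasks = ((uint32_t)1u << TASK_COUNT) - 1u;

    if (alive_flags != all_tasks) {
        return false;
    }
    alive_flags = 0;
    hw_watchdog_kick();
    return true;
}

unsigned watchdog_kicks(void)
{
    return kick_count;
}
```

If any single task hangs, its bit is never set again, the kick stops, and the reset path (with whatever logging it includes) takes over.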
Reply by Reinhardt Behm January 30, 2019
AT Tuesday 29 January 2019 05:22, Phil Hobbs wrote:

> On 1/28/19 3:31 PM, StateMachineCOM wrote: >>> Assertions are not the same thing as checking your input. >> >> Absolutely. You need to very carefully distinguish between the >> erroneous behavior (a.k.a. bug) and exceptional condition, which is >> rare but can arise legitimately. Assertions are for errors. I've >> written specifically about it in the Dr.Dobb's article "An Exception >> or a Bug?" [http://www.drdobbs.com/an-exception-or-a-bug/184401686 ] >> >>> Assertions are there to check that your code is sane. They are >>> designed to be removed in production code. >> >> I'm exactly challenging this beaten-path point of view, because it >> suggests to stop checking the sanity of the production code. This >> would work if *all* errors are completely removed during debugging. >> Are they really removed in YOUR code? >> >> And also, relevant for the OP, are you really suggesting to leave the >> watchdog in the production code while disabling other assertions. If >> so, WHY? >> >> I'm looking forward to interesting discussion... >> > > A generally very sensible article. > > I'm all for having error checking in production code, but I don't call > those 'assertions'. I don't like the idea of leaving _assertions_ in, > though, because (a) abort() or a hard reset is a mighty big hammer to > apply that broadly, and (b) it deprives me of a very useful facility for > debugging, because I can't use as many of them as I want if they all > have to be left in the production builds.
The hammer might be much too big. Would you like the engines in an aircraft to just stop because an assertion stopped the engine controller?

I always have a lot of assertions or assertion-like constructs in my code to make sure during development that any errors get caught. But production code is without any such things. They all get removed and then the _changed_ software gets verified again.

How can you otherwise make sure that every code path is tested during software integration and verification? This testing and verification has to be performed with the same software as will be deployed. The assertions should never fire, so you would have untested code in the deployed software.

--
Reinhardt
Reply by Tom Gardner January 29, 2019
On 29/01/19 22:45, Clifford Heath wrote:
> On 30/1/19 4:30 am, StateMachineCOM wrote: >>> @Clifford Heath We had a set of assert macros that would abort in the >>> test environment, but return an error code when run in production so the >>> caller needed to explicitly ignore or handle the error condition. That >>> gives you proper feedback during testing but proper error handling in >>> prod. >> >> Seriously? Do you really believe that the error codes are checked and >> proper actions taken in *all* cases? > > I know they're not. But if you're leading a team of 40 software "engineers" > the best you can do is to make it easier to do things right. Poor error > management is probably the biggest cause of user dissatisfaction over the > entire history of the IT industry, and I did what I could to improve that, in > my small corner. > >> Isn't this just kicking the can down the road and into some other code, >> which is ill-prepared to "handle" your bugs? > > No. It's forcing the programmer to think about the situations where the error > might occur, and make decisions about how to notify the user (or the calling > code) about the problem, the reason, and the possible solutions. I.e. avoid > just saying "Unknown error 0x80000000".
Related, but slightly different... A good feature of checked exceptions in Java is that they force the programmer to either catch an exception thrown by a "library function" or declare that it could be thrown to whatever calls this code. Thus the possibility of errors has to be explicitly addressed (even if, as I've seen, the exception is caught and ignored). But that's "too much of a burden", so the modern practice is to only throw unchecked exceptions that aren't declared and checked by the compiler.
Reply by Clifford Heath January 29, 2019
On 30/1/19 4:30 am, StateMachineCOM wrote:
>> @Clifford Heath >> We had a set of assert macros that would abort in the test >> environment, but return an error code when run in production >> so the caller needed to explicitly ignore or handle the >> error condition. That gives you proper feedback during >> testing but proper error handling in prod. > > Seriously? Do you really believe that the error codes are checked and proper actions taken in *all* cases?
I know they're not. But if you're leading a team of 40 software "engineers" the best you can do is to make it easier to do things right. Poor error management is probably the biggest cause of user dissatisfaction over the entire history of the IT industry, and I did what I could to improve that, in my small corner.
> Isn't this just kicking the can down the road and into some other code, which is ill-prepared to "handle" your bugs?
No. It's forcing the programmer to think about the situations where the error might occur, and make decisions about how to notify the user (or the calling code) about the problem, the reason, and the possible solutions. I.e. avoid just saying "Unknown error 0x80000000". Clifford Heath.
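A macro set of the kind described (abort in the test environment, return an error code in production) might be sketched in C like this. It is a guess at the general shape, not the team's actual macros: `TEST_BUILD`, `CHECK_OR_RETURN`, `MUST_CHECK`, the error codes and `buffer_put()` are all hypothetical. The GCC/Clang `warn_unused_result` attribute is one way to push callers to handle or explicitly ignore the result.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error codes. */
enum { ERR_OK = 0, ERR_INVARIANT = 1 };

#ifdef TEST_BUILD
/* Test environment: report the failed condition and abort at once. */
#define CHECK_OR_RETURN(cond)                                        \
    do {                                                             \
        if (!(cond)) {                                               \
            fprintf(stderr, "%s:%d: check failed: %s\n",             \
                    __FILE__, __LINE__, #cond);                      \
            abort();                                                 \
        }                                                            \
    } while (0)
#else
/* Production: hand the problem back to the caller as an error code. */
#define CHECK_OR_RETURN(cond)                                        \
    do {                                                             \
        if (!(cond)) {                                               \
            return ERR_INVARIANT;                                    \
        }                                                            \
    } while (0)
#endif

/* Ask GCC/Clang to warn when a caller silently drops the result. */
#if defined(__GNUC__)
#define MUST_CHECK __attribute__((warn_unused_result))
#else
#define MUST_CHECK
#endif

/* Appends one byte; the caller is expected to have ensured room. */
MUST_CHECK int buffer_put(unsigned char *buf, size_t cap, size_t *len,
                          unsigned char byte)
{
    CHECK_OR_RETURN(buf != NULL && len != NULL);
    CHECK_OR_RETURN(*len < cap);
    buf[(*len)++] = byte;
    return ERR_OK;
}
```

In a production build the caller sees `ERR_INVARIANT` and has to decide what to tell the user. Nothing forces that decision to be a good one, which is the objection raised in the quoted text.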
Reply by Phil Hobbs January 29, 2019
On 1/29/19 12:30 PM, StateMachineCOM wrote:
>> @Clifford Heath We had a set of assert macros that would abort in >> the test environment, but return an error code when run in >> production so the caller needed to explicitly ignore or handle the >> error condition. That gives you proper feedback during testing but >> proper error handling in prod. > > Seriously? Do you really believe that the error codes are checked and > proper actions taken in *all* cases? Isn't this just kicking the can > down the road and into some other code, which is ill-prepared to > "handle" your bugs? > >> @Phil Hobbs So it's nice to leave assert() for debug and roll your >> own macro set for runtime. > > I'm not sure what you are proposing by "rolling your own" for > production code. What those "other versions" of assert macros in > production code are supposed to do?
Depends. For instance, in a clusterized EM simulation code from a dozen years ago, I have an assert-like macro called 'carefully'. A use example is in the simulation half-step that calculates the H field from the E field, which goes

    carefully(HfromE());

That's typically fairly time-consuming--tens of milliseconds to tens of seconds depending on the size of the simulation and the size of the cluster.

The code has a tree-structured distributed supervisor scheme, a bit like a simple version of ganglia. So if some thread on some box finds a NaN or runs out of memory or something, I have to make sure all threads exit on all boxes, or the thing will hang forever waiting for the dead thread. (That's what CIS.CCom.SetStat() does.)

    #define carefully( x ) Carefully( (x), __FILE__, __LINE__)

    ...

    // We can't just abort the simulation when we run into some problem
    // in a subsidiary thread--we need to tell thread 1 before ending
    // the thread
    // NB: Don't use this function in the main thread!
    //
    // This is called via the Carefully macro, which expands into
    // Carefully(somefunctioncall(),__FILE__,__LINE__);
    void Carefully( int what, const char * fn, int Line)
    {
      if ( EMERROR_OK != what )
      {
        if (quiet < 2)
        {
          char buf[512];
          sprintf(buf,"%s(%d): Chunk error %d--dying....\n", fn, Line, what);
          pemerror(what,buf);
        }
        CIS.CCom.SetStat(dead);
        _endthread(); // fatal error--only destructors do
                      // cleanup
      } /* End if */
    };

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
https://hobbs-eo.com