Global Variables vs. Safe Software
It seems that Reddit's programming and technology subreddits have only recently caught up to 2013. If you take a look, you'll find plenty of discussion and controversy surrounding a 2013 blog post from Safety Research & Strategies Inc. regarding the software failings behind Toyota's unintended acceleration problems.
The article is a good overview, but for full credit you should read the writeups, presentations and testimony from Michael Barr - the engineer who led the team which analyzed Toyota's code. The long and short of it is that through analysis of the source code responsible for throttle control, the team identified and demonstrated a series of steps that would cause a software failure that induces the unintended acceleration seen in Toyota cars. His testimony helped convince the jury to rule against Toyota and led the company to settle with the plaintiffs before any punitive damages could be awarded in the case.
Mr. Barr wove a disturbing picture of lax or nonexistent software standards, improper use of redundancy, incorrect watchdog timer handling, bad software analysis and overloaded processors being pushed to their limits and beyond. It's a chilling and sobering read with disturbing implications for anyone who writes software, cares about safety or drives a car. It should make us question how software which can harm us should be written:
- Should there be a certification authority for automotive software?
- Should all safety-critical software be open-source?
- Should software even be used in such critical applications, or should we rely on mechanical or hardware methods instead?
While there is some discussion of these salient topics, I was surprised that most of it centers on the article's claim that the code had '10,000 global variables'. Many people are confused: is this definitely a bad thing, merely potentially a bad thing, or even a good thing? Is it common in embedded systems, or perhaps just in safety-critical systems? There's a lot of confusion about how to judge the claim and what its actual upshot is, and since I have experience in both embedded systems and safety-critical systems, I thought I'd throw in my two cents and address some of the assertions made on Reddit. I'll deconstruct some of my favorites and hopefully clear the air a bit.
'It's common for all variables to be global in embedded systems/safety-critical systems'
First, I don't know that it's 'common' at all. I have seen this practice used before in embedded systems, but my impression is that it's far from ubiquitous. In any case, 'common' is hardly a defense for awful design - especially in safety-critical systems. There's absolutely no protection for global variables: any code anywhere in the entire codebase can modify them at will. It's a nightmare: instead of one piece of code doing one thing, every piece of code does everything. Systems like this are nearly impossible to document, debug, extend or fix.
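To make the contrast concrete, here's a minimal sketch (the throttle names are invented for illustration, not from Toyota's code) of the alternative: keep the state file-local and funnel every write through one function that can enforce range checks, instead of exposing a global that any code can modify.

```c
/* Hypothetical example: rather than a global that the whole codebase
 * can write to, keep the state file-local and expose two functions. */

static int throttle_position = 0;  /* file-local: invisible outside this file */

int throttle_get(void)
{
    return throttle_position;
}

int throttle_set(int position)
{
    /* Every write passes through this single checkpoint. */
    if (position < 0 || position > 100) {
        return -1;  /* reject out-of-range values, state unchanged */
    }
    throttle_position = position;
    return 0;
}
```

Now there is exactly one place to set a breakpoint, one place to add logging, and one place where an invariant like "0 to 100" is enforced.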
Some argue that using only global variables aids in debugging embedded systems. Sadly, this has some truth to it: debuggers for embedded systems range from good to awful, with most of them tending towards awful. Some of them have particularly bad problems with scoped variables, which makes using file-local or function-local variables a dicey proposition. I'm certainly sympathetic to embedded developers in this position, but making everything a global variable is a poor fix. Breaking the software down into simpler functional units and implementing unit tests and module-level tests is a much better approach than fundamentally breaking your design.
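The unit-test alternative looks something like this (clamp_rpm is an invented example unit, not anything from the case): factor the logic into small pure functions and exercise them with plain assertions on the host, where no on-target debugger is needed at all.

```c
/* Hypothetical example unit: a small pure function with no hidden
 * state, testable on a host machine with ordinary assertions. */

int clamp_rpm(int rpm, int lo, int hi)
{
    if (rpm < lo) {
        return lo;   /* below the floor: pin to the floor */
    }
    if (rpm > hi) {
        return hi;   /* above the ceiling: pin to the ceiling */
    }
    return rpm;      /* in range: pass through unchanged */
}
```

Because the function reads nothing but its arguments, you never need to watch a global in a flaky debugger to know what it did.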
'Global variables are preferable to dynamic memory allocation'
This point doesn't even make sense. The opposite of dynamic memory allocation is static memory allocation, not global variables. While it's true that all global variables are statically allocated, so are file-local variables and static function-local variables. There are plenty of options to avoid dynamic allocation without making everything a global variable.
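A quick sketch of those options (the buffer and counter names are invented for illustration): all of the storage below is reserved at link time, no dynamic allocation anywhere, yet none of it is a global visible to the rest of the program.

```c
#include <stddef.h>

/* Static allocation without globals, two ways: */

/* 1. A file-local static array: fixed storage, invisible outside
 *    this translation unit. */
static unsigned char rx_buffer[256];

size_t rx_capacity(void)
{
    return sizeof rx_buffer;
}

/* 2. A static function-local: persists across calls, but only this
 *    function can touch it. */
unsigned long next_sequence(void)
{
    static unsigned long seq = 0;
    return ++seq;
}
```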
'Global variables are preferable to stack-allocated variables'
This one has some truth to it - but only some. Every non-static function-local variable is allocated on the stack, and the stack has a maximum depth that it must not exceed. Thus, if you have an excessive number of function-local variables, you might run out of room.
But this problem should absolutely not sneak up on you. One of the analyses that you're supposed to perform when writing safety-critical software is the stack depth analysis. You count up the number of nested function calls, local variables and everything else that will be added to the stack and see if it exceeds the stack's maximum depth. If it does, then you need to redesign your software to use less stack. Making everything a global variable does reduce stack usage, but there are certainly other approaches.
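One such approach, sketched below with invented names: when the analysis shows a large function-local buffer blowing the budget, move just that buffer to file-scope static storage. It comes off the stack without becoming a global.

```c
#include <string.h>

enum { FRAME_SIZE = 512 };

/* A 512-byte scratch array as a function-local would cost ~512 bytes
 * of stack on every call, deep in the call chain. As a file-local
 * static it is allocated once, at link time, off the stack - and it
 * is still invisible outside this file. */
static unsigned char scratch[FRAME_SIZE];

unsigned int frame_checksum(const unsigned char *frame, unsigned int len)
{
    unsigned int sum = 0;
    unsigned int i;

    if (len > FRAME_SIZE) {
        len = FRAME_SIZE;          /* never overrun the scratch area */
    }
    memcpy(scratch, frame, len);   /* work on a private copy */
    for (i = 0; i < len; i++) {
        sum += scratch[i];
    }
    return sum;
}
```

This is a targeted fix driven by the stack analysis, which is very different from reflexively making every variable in the program global. (Note that in a preemptive system a shared static buffer like this needs to be confined to one task or protected, which is exactly the kind of decision the design documentation should record.)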
In any case, Barr testified that Toyota wasn't performing stack depth analysis correctly (they were underestimating the stack usage), so it's doubtful their overreliance on global variables was primarily a result of issues with their stack depth. On top of that, Toyota had even bigger problems with stack depth because their code utilized recursion! Recursion has the potential to quickly consume the entire stack, so if stack depth was their concern, recursion, not local variables, is where they should have started.
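To see why recursion is the bigger offender, compare these two (deliberately toy) routines: the recursive one consumes one stack frame per element, so its worst-case depth grows with the input, while the iterative rewrite uses a single frame no matter how large the input is.

```c
/* Worst-case stack depth grows linearly with n: one frame per call. */
unsigned long sum_recursive(const unsigned int *v, unsigned int n)
{
    if (n == 0) {
        return 0;                               /* base case */
    }
    return v[0] + sum_recursive(v + 1, n - 1);  /* one more frame */
}

/* Worst-case stack depth is one frame, independent of n. */
unsigned long sum_iterative(const unsigned int *v, unsigned int n)
{
    unsigned long total = 0;
    unsigned int i;

    for (i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}
```

The iterative version gives the stack depth analysis a fixed number to work with; the recursive version forces the analyst to bound n, and a wrong bound means a blown stack.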
'It was probably autogenerated code, so it's fine'
This is very frustrating because I've worked on safety-critical projects where we used autogenerated code. You don't just get a pass on sloppy code because you didn't write it. It still needs requirements, it still has to meet your coding standards, it still has to be tested and you still need documentation for it. The same goes for code that comes from a third-party library. You are required to 'own' every line of code that goes into the final product and if it isn't up to standards, it doesn't fly. Period. If your code generator writes bad code it's no different than if a human writes bad code.
'MISRA is not a realistic coding standard anyway'
MISRA is a coding standard that is popular for use in automotive applications. It forbids some aspects of C that tend to produce unsafe code and places restrictions on others - global variables included. It's widely seen as a good basis for writing safety-critical code. Note that I said 'good basis' and not 'final and ultimate set of rules for writing any code ever'. There are certainly aspects to MISRA or any other general coding standard that may be impossible or undesirable to apply on any given project. What a good developer or company does in these situations is to note the deviation and provide a justification. What a poor developer or company does is chuck the whole standard out (and in some cases, the whole idea of a coding standard) and write whatever code pleases them.
If Toyota didn't like MISRA they could have justified their deviations or written their own coding standard which allowed them to write code any way they pleased. They didn't do either of these. They paid lip service to MISRA but didn't take steps to enforce compliance with it. You can argue against MISRA all you want, but the fact of the matter is that Toyota simply didn't have the processes to follow any coding standard. That is the real problem, not MISRA's strictness.
In my opinion, 10,000 global variables is a significant 'code smell' and in the case of Toyota it doesn't take too much to find the source of the smell: no enforced software development standards or processes. This is the real story. It's a sad state of affairs when there's more discussion of good software design and the appropriate role of global variables on Reddit than within the halls of Toyota.