Review: Embedded Software Design: A Practical Approach to Architecture, Processes, and Coding Techniques
Full disclosure: I was given a copy of this book to review.
Embedded Software Design: A Practical Approach to Architecture, Processes, and Coding Techniques, by Jacob Beningo, is an excellent introduction to strategies for embedded systems design and bringing those designs to fruition. Renowned embedded systems expert Jack Ganssle was the technical reviewer.
This is a practical how-to book on the modern professional practice of embedded systems software development. Where other books focus on the low-level technical details of hardware, RTOSes, and software, this one steps back to a broader view, focusing on the higher-level considerations that turn those details into systems. Both perspectives are important.
The book covers a lot of ground, so there's something for embedded developers at all levels, from beginners through experienced professionals.
It's organized around what Beningo calls the embedded software triad:
- Software architecture and design
- Agile, DevOps, and processes
- Development and coding skills
These elements interact and overlap. Getting a good balance between them is critical to successful delivery of products. Out-of-balance environments result in late or inconsistent software, possibly never completed, with quality issues and lots of rework.
The book is split into 4 parts, one for each element of the triad, and a final one on next steps. Each part delves into specific details, outlining principles and then offering practical methods.
The appendices include two outstanding extended practical examples to help apply the principles.
The level of detail is mixed. You can treat some items as full recipes to follow, while others are more introductory, providing guidance for further study. It's important to remember that no single book can provide all the answers for all possible cases, because the field is too broad. But the book offers enough specific examples to illustrate the direction to go, following the principles for a disciplined approach. Tailor them to your specific industry, project, runtime environment, and development/deployment environment.
One very important focus of the book that's been largely overlooked by the industry is security. The book puts it front and center. Security needs to be an integral part of architecture and design from the beginning; it's not something that can be retrofitted after the fact.
The biggest value this book provides is a word I use a lot: discipline. Whether you find everything you need in the book or have to look further, it lays out a disciplined approach to follow rather than an ad hoc "hope and pray" one. Given that embedded systems failures can ruin lives, that's critical.
Part I: Software Architecture and Design
This chapter lays out a set of 7 modern software design philosophy principles that serve as touchpoints for the rest of the book. This is important because embedded systems have shifted from the largely hardware-centric designs of the past to much more complex software-centric designs. The philosophy provides methods for managing that complexity.
This chapter discusses approaches to architecture and characteristics of good architecture, including coupling and cohesion. It examines various architectures for their maintainability, scalability, and portability. The latter consideration has proved especially important given recent supply chain difficulties that have required swapping out hardware.
One very interesting aspect is application domains, which decompose the application into different execution regions:
- Privilege domain: the registers and memory a section of code is allowed to access.
- Security domain: isolating code into secure and non-secure processing environments (SPE and NSPE).
- Execution domain: which cores code runs on.
- Cloud domain: what processing can be offloaded to the cloud.
This chapter focuses on security, important for protecting intellectual property, user data, and brand identity. It's also become increasingly important for legal and regulatory reasons.
Beningo uses the example of the ARM Platform Security Architecture (PSA). This includes:
- Threat models and analysis documentation.
- Hardware and firmware specifications.
- An open source reference implementation.
- An independent security evaluation scheme.
PSA uses four stages to build secure systems:
- Analysis of data assets, threat models, and vulnerabilities.
- Architecting the system, including hardware selection and software.
- Implementation using things like TrustZone, multicore processors, and secure boot.
- Certifying that the system meets its security requirements.
This chapter covers RTOS application design, making use of the facilities offered by an RTOS. Real-time systems require not just correct computation, but also timely response within a deadline. The right answer, delivered too late, is a failure.
Beningo discusses task, thread, and process decomposition and prioritization. It can be unclear how to divide up a system into these concurrent components and set their execution priorities, so he offers practical methods for approaching them.
This chapter outlines design patterns for real-time systems, including patterns for single and multicore development, publish/subscribe models, RTOS resource and activity synchronization, interrupt handling, and low power consumption.
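To give a flavor of one of these patterns, here's a minimal publish/subscribe sketch. It is not taken from the book; the `MessageBus` class and topic names are hypothetical, and it's written in Python for brevity where real firmware would typically use C with RTOS queues.

```python
class MessageBus:
    """Hypothetical in-process message bus: publishers and subscribers only share topic names."""

    def __init__(self):
        self._subscribers = {}  # topic name -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        """Deliver message to every subscriber of topic; return how many were notified."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)
```

The publisher never needs to know who consumes the data, which keeps coupling between modules low.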
Part II: Agile, DevOps, and Processes
This chapter discusses software quality, metrics, and processes. Ignoring these produces bad outcomes.
Quality is a multifaceted attribute of a system. It's much more than just functional quality. It's also the structural quality, which affects the system's robustness, scalability, and maintainability. Systems evolve over time, gaining new features and complexities and growing in deployment scale. Long-term success depends on good structural quality, which then drives good functional quality.
Metrics provide practical ways to measure quality, and processes apply them to catch bugs and defects early while they are cheap and easy to fix.
This chapter covers DevOps for embedded systems. DevOps integrates software development with software operations. The goal is to increase the efficiency, speed, and security of the development and delivery process.
A key component in applying DevOps principles is the concept of the delivery pipeline. This is the process that feeds development into delivery, with continuous feedback along the way. A variety of mechanisms automate portions of the pipeline.
Beningo outlines 5 parts to a delivery pipeline:
- Source control
- A CI/CD framework for managing pipelines
- Build automation tools
- Code testing and analysis frameworks
- Review, approval, and deployment tools
Two aspects of DevOps draw differing opinions and are worth discussing here: source control branching, and continuous deployment (CD).
Beningo advocates feature branching, which keeps feature development separate from the main source code repository until it's ready to be pulled in. The idea is to isolate the changes so that they don't affect other developers until it's time to integrate them.
A different method that's common in the literature is trunk-based development, where all code is immediately incorporated into the main trunk of the repository. The idea is to expose changes to other developers as soon as possible so that they can integrate them immediately and detect issues sooner. "Feature flags" (aka "feature toggles") can be used to mitigate risk by selectively enabling and disabling feature capabilities at build or run time.
There are pros and cons to each approach. Isolating on branches means that it can be a big and painful job to integrate the changes once the time comes, and other developers may have made decisions in the meantime that conflict with the new code. Integrating immediately means that new code that is not yet ready for delivery may disrupt other development, and the set of feature flags can get complex to manage and test.
In embedded systems development, feature branches can offer more controlled integration, which may be required in some environments and industries. Painful integration can be mitigated by keeping branches short-lived (hours or days rather than weeks or months), in small increments of functionality, possibly including careful use of feature flags. This achieves some of the benefits of trunk-based development.
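As a rough illustration of the feature flag idea, here's a minimal sketch. The flag names and the `process_sample` function are hypothetical, and in C firmware the same thing would usually be a build-time macro or a value read from configuration storage.

```python
# Hypothetical flag table; names are for illustration only.
FEATURE_FLAGS = {
    "new_filter": False,  # merged but still in development: off by default
    "status_led": True,   # finished feature: on
}


def is_enabled(flag, flags=FEATURE_FLAGS):
    """Unknown flags are treated as disabled."""
    return flags.get(flag, False)


def process_sample(raw, flags=FEATURE_FLAGS):
    if is_enabled("new_filter", flags):
        return raw * 0.5  # placeholder for the unfinished code path
    return raw            # existing, proven behavior
```

The unfinished code is integrated and built with everything else, but only runs where the flag is turned on. The cost is that each flag doubles the number of configurations that may need testing.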
The second aspect relates to CI/CD pipelines that automate the delivery process. How far should the pipeline go? Continuous Integration (CI) automatically integrates developer changes into the build continuously (tying into trunk-based development and feature branching) to get feedback on them as quickly as possible. Continuous Deployment (CD) automatically deploys the changes that successfully pass all the automated building, validation, and testing stages.
For non-embedded systems, CD deploys changes to the production environment. But that generally isn't appropriate or practical for embedded systems. While a backend server in the cloud may be deployed multiple times per day, an embedded system needs to be more stable. You don't want to be updating embedded devices that frequently; that's a recipe for chaos.
For some systems, it's physically impossible, since they can only be flashed during manufacturing. For those that support over-the-air (OTA) updates, you don't want to be consuming the communications bandwidth to update thousands or millions of fielded devices continuously.
However, Continuous Deployment still makes sense if you consider the deployment target to be the final environment just prior to delivery to production devices. Then you have vetted releases ready to go at all times. When you decide it's time to provide an update to manufacturing or via OTA, you release it for distribution.
There may also be cases where automated testing in the pipeline is unable to do everything you need to do:
- The physical nature of embedded systems means there may be physical interactions that have to be tested by other means. Some of these tests may be very expensive and complex to conduct, such as test-firing a rocket motor control system.
- You may want to do beta testing with a set of beta test users.
- Regulated industries may require a final certification process.
CD can deploy changes continuously into an artifact repository that makes them available to that physical test, beta test, or certification environment.
The point is, don't think you can't benefit from Continuous Deployment just because you're working on an embedded system. You can; it's just a matter of how far that goes into delivering software into the hands of end users.
This chapter covers testing, verification, and Test-Driven Development (TDD). Beningo discusses the different types of testing, what makes tests good, and how many tests are appropriate based on cyclomatic code complexity.
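McCabe's cyclomatic complexity counts the linearly independent paths through a function, so it's commonly treated as a lower bound on the number of test cases needed. Here's a minimal sketch of the graph formula M = E - N + 2P (edges, nodes, connected components). The control-flow graph below is a made-up example of a function with one if/else followed by one loop.

```python
def cyclomatic_complexity(edges, components=1):
    """McCabe's formula M = E - N + 2P for a control-flow graph given as (from, to) edges."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * components


# Hypothetical function: if/else, then a while loop, then return.
CFG = [
    ("if", "then"), ("if", "else"),
    ("then", "loop"), ("else", "loop"),
    ("loop", "body"), ("body", "loop"),
    ("loop", "exit"),
]

print(cyclomatic_complexity(CFG))  # 7 edges - 6 nodes + 2 = 3
```

Two decision points plus one gives the same answer, 3, which suggests at least three test cases for that function.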
He introduces TDD for embedded systems and shows how to set up a test harness for it. TDD is an outstanding method for developing software. Don't think of it as a test methodology; think of it as a development methodology, driven incrementally by tests that prove out actual results, which happens to produce a set of tests as an output alongside the production code.
One of the particular advantages of TDD for embedded systems is that it allows testing of code even when the target hardware isn't yet available. By the time you integrate things on the target hardware, or some representative version of it, you're working with known-good software components.
That significantly cuts down the problem space and the amount of work you have to do debugging on the target. Rather than dealing with 100% unproven code on the target, you're dealing with 90-99% proven code, and just the last 1-10% of target-specific integration code.
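Here's a minimal sketch of what off-target testing looks like. It is not the book's example: the `Heater` and `FakePin` classes are hypothetical, and it's in Python for brevity where the real thing would be C with a unit test framework. The key idea is that the logic talks to hardware only through a small interface that a test double can replace.

```python
class FakePin:
    """Test double that records the last value written instead of touching hardware."""

    def __init__(self):
        self.is_on = False

    def write(self, on):
        self.is_on = on


class Heater:
    """Turns the output on below (setpoint - band) and off above (setpoint + band)."""

    def __init__(self, pin, setpoint_c, band_c=1.0):
        self.pin = pin
        self.setpoint_c = setpoint_c
        self.band_c = band_c

    def update(self, temperature_c):
        if temperature_c < self.setpoint_c - self.band_c:
            self.pin.write(True)
        elif temperature_c > self.setpoint_c + self.band_c:
            self.pin.write(False)
        # Inside the band: leave the output unchanged.


def test_turns_on_when_cold():
    pin = FakePin()
    Heater(pin, setpoint_c=20.0).update(15.0)
    assert pin.is_on


def test_turns_off_when_hot():
    pin = FakePin()
    pin.is_on = True
    Heater(pin, setpoint_c=20.0).update(25.0)
    assert not pin.is_on


def test_holds_state_inside_band():
    pin = FakePin()
    heater = Heater(pin, setpoint_c=20.0)
    heater.update(15.0)  # cold: turns on
    heater.update(20.5)  # within band: stays on
    assert pin.is_on
```

A test runner such as pytest will collect the `test_*` functions, or they can simply be called directly. On the target, only the real pin driver remains to be proven.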
For more on TDD, see my blog posts Unit Testing For Embedded Software Development and Acceptance Tests vs. TDD.
This chapter covers modeling, simulation, and deployment. The primary modeling methods are UML using various tools, and MATLAB.
Modeling is a great way to explore and then communicate system details. More powerful tools can even generate code from models.
Simulation (again in MATLAB) then allows executing models to exercise the various scenarios. This allows rapid prototyping using abstractions that can be modified easily.
The section on deployment reiterates some of the details of chapter 7.
This chapter covers defect minimization. It explores some of the sources of defects and outlines a set of phases that each provide additional points to identify and eliminate them.
What I really like about the phases is that they provide a disciplined, complete approach that incorporates all the tools and methods available to bring the full due diligence to the process.
Phase 4 discusses documentation facility setup. Documentation is a never-ending argument in software development. How much, in what form, and where? In general, I like a multi-dimensional, multi-media approach. Models, diagrams, comments, and element names (the DAMP principle: Descriptive And Meaningful Phrases for function names, variable names, field names in structures, constants, etc.) all work together to convey information.
Yes, it takes effort to keep them in sync, and it's easy for them to get out of hand. That's part of the work of engineering.
But documentation beyond the raw code helps a system live a long life as many people touch it, and that's part of the discipline. Relying solely on "self-documenting code" is a recipe for disaster. Related to that, Beningo wrote a nice article some time ago on 10 Tricks for Documenting Embedded Software.
Part III: Development and Coding Skills
This chapter discusses microcontroller selection. This used to be a very hardware-centric process, but has become much more software-centric. Beningo outlines a 7-step selection process that evaluates the various hardware and software considerations, and then a final selection matrix for making the decision.
This chapter is particularly valuable for those who have never had to perform microcontroller selection before.
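One common way to build such a matrix is a weighted score per candidate. Here's a minimal sketch under that assumption; the criteria, weights, scores, and part names are all made up for illustration, and the book's matrix may be organized differently.

```python
# Hypothetical criteria and weights (higher weight = more important).
WEIGHTS = {"cost": 3, "ecosystem": 5, "power": 4}

# Hypothetical candidates, each scored 1 (poor) to 5 (excellent) per criterion.
CANDIDATES = {
    "MCU_A": {"cost": 4, "ecosystem": 3, "power": 5},
    "MCU_B": {"cost": 3, "ecosystem": 5, "power": 3},
}


def weighted_total(scores, weights):
    """Sum of weight * score over every criterion."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


def rank(candidates, weights):
    """Candidate names ordered from highest to lowest weighted total."""
    return sorted(
        candidates,
        key=lambda name: weighted_total(candidates[name], weights),
        reverse=True,
    )
```

With these numbers MCU_A totals 47 and MCU_B totals 46, so a close result like this is a prompt to revisit the weights rather than a verdict.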
This chapter discusses software interfaces, contracts, and assertions. Interfaces are a critical element of design, establishing contracts between caller and callee. Assertions are a mechanism for enforcing contracts, but have some particular challenges in real-time systems.
Beningo covers design-by-contract as a way to specify the preconditions, side effects, postconditions, and invariants of a called interface. He goes into the practical details of defining and using assertions to enforce these, and ways of handling assertion failures in real-time systems.
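Here's a minimal sketch of the idea, not the book's implementation. The `check` helper, the fault log, and the PWM function are hypothetical, and the recovery policy (log the failure and fall back to a safe output rather than halting) is just one assumption about what a real-time system might do.

```python
PWM_PERIOD_TICKS = 1000  # hypothetical timer period

fault_log = []  # stand-in for a persistent fault log


def check(condition, message):
    """Assertion that records a failure instead of halting the system."""
    if not condition:
        fault_log.append(message)
    return condition


def duty_to_ticks(percent):
    """Contract: 0 <= percent <= 100 in, 0 <= ticks <= PWM_PERIOD_TICKS out."""
    if not check(0 <= percent <= 100, "precondition: percent must be in 0..100"):
        return 0  # safe state: output off
    ticks = percent * PWM_PERIOD_TICKS // 100
    check(0 <= ticks <= PWM_PERIOD_TICKS, "postcondition: ticks within period")
    return ticks
```

The contract is stated once at the interface, and a caller that violates it is caught at the boundary instead of producing a bad output further downstream.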
This chapter covers techniques for configurable software. Configurable software lets you write reusable, scalable code that can be incorporated in a number of applications, rather than repeatedly reimplementing the same code.
This is driven by using configuration tables that are easy to maintain and adapt to different systems. Beningo shows examples of GPIO and task configuration tables that centralize all the details of handling them.
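Here's a minimal sketch of a GPIO configuration table. The pin names, fields, and the `write_register` callback are hypothetical, and in C this would be a const array of structs. The point is that the init code is generic and everything board-specific lives in the table.

```python
from typing import NamedTuple


class PinConfig(NamedTuple):
    name: str
    port: str
    pin: int
    direction: str  # "in" or "out"
    initial: int    # initial output level


# Hypothetical board: porting to new hardware means editing only this table.
GPIO_CONFIG = (
    PinConfig("STATUS_LED", "A", 5, "out", 0),
    PinConfig("HEATER_EN", "B", 1, "out", 0),
    PinConfig("USER_BUTTON", "C", 13, "in", 0),
)


def gpio_init(table, write_register):
    """Apply every table entry through the supplied low-level write function."""
    for cfg in table:
        write_register(cfg.port, cfg.pin, cfg.direction, cfg.initial)
```

Passing `write_register` in also makes the init code easy to test off-target with a fake.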
This can be further leveraged using configuration files and scripts to autogenerate code. Beningo extends the task configuration example to show configuration files, source code templates, and a Python script that generates the code from them.
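Here's a minimal sketch of that kind of generator, not the book's script. The JSON layout, the task names, and the C type `TaskConfig_t` are hypothetical; it uses only `json` and `string.Template` from the Python standard library.

```python
import json
from string import Template

# Hypothetical configuration file contents.
CONFIG_JSON = """
[
  {"name": "Task_Sensor",    "period_ms": 10,  "priority": 3},
  {"name": "Task_Telemetry", "period_ms": 100, "priority": 1}
]
"""

# Hypothetical C source templates.
ENTRY_TEMPLATE = Template("    { $name, $period_ms, $priority },")
FILE_TEMPLATE = Template("""\
/* Generated file: do not edit by hand. */
static const TaskConfig_t TaskConfig[] = {
$entries
};
""")


def generate(config_text):
    """Render the C task table from the JSON configuration text."""
    tasks = json.loads(config_text)
    entries = "\n".join(ENTRY_TEMPLATE.substitute(task) for task in tasks)
    return FILE_TEMPLATE.substitute(entries=entries)


print(generate(CONFIG_JSON))
```

Adding a task then means adding one entry to the configuration file and regenerating, rather than hand-editing C source.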
This chapter covers communications with the external world, command processing, and telemetry.
Beningo shows a lightweight communications protocol suitable for UART, USB, WiFi, etc. Then he shows packet reception, parsing, and validation, followed by table-driven command processing. For sending data off-device, such as status or data streaming, he shows two architectures for telemetry.
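Here's a minimal sketch of the receive, validate, and dispatch flow. The frame layout (sync byte, command, length, payload, additive checksum), the command IDs, and the handlers are all assumptions made for illustration. This is not the protocol from the book.

```python
SYNC = 0xA5  # hypothetical start-of-frame marker


def checksum(data):
    """Simple additive checksum, modulo 256."""
    return sum(data) & 0xFF


def build_packet(cmd, payload=b""):
    """Frame layout: SYNC, CMD, LEN, PAYLOAD..., CHECKSUM over CMD..PAYLOAD."""
    body = bytes([cmd, len(payload)]) + payload
    return bytes([SYNC]) + body + bytes([checksum(body)])


def parse_packet(frame):
    """Return (cmd, payload), or None if the frame is malformed."""
    if len(frame) < 4 or frame[0] != SYNC:
        return None
    cmd, length = frame[1], frame[2]
    if len(frame) != 4 + length:
        return None
    if checksum(frame[1:-1]) != frame[-1]:
        return None
    return cmd, bytes(frame[3:3 + length])


def cmd_ping(payload):
    return b"pong"


def cmd_echo(payload):
    return payload


# Table-driven dispatch: adding a command means adding one table entry.
COMMAND_TABLE = {0x01: cmd_ping, 0x02: cmd_echo}


def process(frame):
    """Validate a frame and run its handler; reply NAK on any failure."""
    parsed = parse_packet(frame)
    if parsed is None:
        return b"NAK"
    cmd, payload = parsed
    handler = COMMAND_TABLE.get(cmd)
    return handler(payload) if handler else b"NAK"
```

Because the frame layout lives in `build_packet` and `parse_packet`, the same code works regardless of whether the bytes arrive over UART, USB, or a socket.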
This chapter covers tools for embedded systems development. It discusses using the right tools, and making an appropriate investment to leverage tool value for better, faster development.
Beningo discusses open source vs. commercial tools, and provides recommendations for a number of tools for architecting systems, managing development processes, and implementing systems (including measuring and evaluating them).
One of his main points is that organizations need to be willing to pay for good tools. While there are many useful open source tools that are free, there are commercial tools that are worthwhile and will provide significant value. Don't make tool choices on cost alone. Over the full scope of a project, that can be a costly mistake.
Part IV: Next Steps and Appendixes
This chapter summarizes the embedded software triad and discusses some next steps for applying the methods in the book.
This appendix provides a number of security terminology definitions, a good way to get introduced to an essential topic.
This appendix reiterates the 12 core agile principles of the Agile Manifesto, since many teams think they are "doing Agile" but are not actually doing it.
This appendix is an outstanding extended hands-on exercise showing CI/CD using GitLab. This might be worth the price of the book alone. If you've accepted the principles in the book, this gets you rolling on applying a number of them.
This appendix is another outstanding extended hands-on exercise showing TDD. It also might be worth the price of the book! It goes through developing a heater module using TDD, including using pmccabe and gcov to measure McCabe Cyclomatic Complexity (MCC) and test coverage to determine when enough testing has been done.
One thing I would add to this is the use of Behavior-Driven Development test naming and organization for test cases, so that the test names form a human-readable specification. This follows the "given/when/then" or "given/should/when" form, where test groups form the "given" conditions, and tests verify what the code "should then" do "when" some test condition occurs. The Setup function for the group sets the given conditions, establishing the scenario.
In addition to providing a readable specification, this forces you to focus on testing via the interface to the module, not testing the implementation. By testing to the interface, i.e. testing the behavior, you have robust tests that will be effective even if you change the implementation. By testing the implementation, you create brittle tests that break easily and require lots of maintenance when you change the implementation.
For example, this is the set of test cases when I used TDD to implement a coding challenge to develop an FSM for an elevator (you can see the whole thing at https://github.com/stevebranam/elevator-fsm; note that I am not an expert on elevator systems!):
- Given_StoppedElevator.Should_BeIdle_When_NoActivity
- Given_StoppedElevator.Should_NotBeIdle_When_SameFloorRequested
- Given_StoppedElevator.Should_OpenDoor_When_SameFloorRequested
- Given_StoppedElevator.Should_OpenDoor_When_OpenButtonPushed
- Given_StoppedElevator.Should_MoveToFloor_When_NewFloorRequested
- Given_StoppedElevator.Should_GoOutOfService_When_SameFloorRequestedAndDoorTimesOut
- Given_StoppedElevator.Should_GoOutOfService_When_SameFloorRequestedAndDoorFaults
- Given_MovingElevator.Should_BeWaiting_When_Arrived
- Given_MovingElevator.Should_Stop_When_StopButtonPushed
- Given_MovingElevator.Should_Resume_When_StopButtonPushedTwice
- Given_MovingElevator.Should_GoOutOfService_When_DriveTimesOut
- Given_MovingElevator.Should_GoOutOfService_When_DriveFaults
- Given_WaitingElevator.Should_KeepDoorOpen_When_OpenButtonPushed
- Given_WaitingElevator.Should_CloseDoor_When_TimerExpires
- Given_WaitingElevator.Should_CloseDoor_When_CloseButtonPushed
- Given_WaitingElevator.Should_BeIdle_When_DoorCloses
- Given_OutOfServiceElevator.Should_BeWaiting_When_ServiceRestoredAtGroundFloor
- Given_OutOfServiceElevator.Should_MoveToGround_When_ServiceRestoredOffGroundFloor
- Given_OutOfServiceElevator.Should_MoveToGround_When_ServiceRestoredAtOtherFloor
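To show how a test group maps onto this naming scheme, here's a minimal sketch using Python's `unittest`. The `Elevator` class is a toy stand-in invented for this example, not the FSM from the repository above. Note that `unittest` only discovers methods whose names start with `test`, hence the extra prefix.

```python
import unittest


class Elevator:
    """Toy stand-in for illustration only."""

    def __init__(self, floor=1):
        self.floor = floor
        self.state = "stopped"
        self.door_open = False

    def request_floor(self, floor):
        if floor == self.floor:
            self.door_open = True
            self.state = "waiting"
        else:
            self.state = "moving"


class Given_StoppedElevator(unittest.TestCase):
    def setUp(self):
        # Establishes the "given" scenario shared by every test in the group.
        self.elevator = Elevator(floor=1)

    def test_Should_OpenDoor_When_SameFloorRequested(self):
        self.elevator.request_floor(1)
        self.assertTrue(self.elevator.door_open)

    def test_Should_MoveToFloor_When_NewFloorRequested(self):
        self.elevator.request_floor(3)
        self.assertEqual(self.elevator.state, "moving")
```

Running this with `python -m unittest -v` prints each group and test name, which reads as a specification of the behavior. The tests only touch the public interface, so the internals of `Elevator` can change without breaking them.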
This is an excellent book that will help you bring discipline to the development of embedded systems. That means better, more robust systems, and more effective development processes.