How to Deploy Local LLMs for Embedded Software Development: Terminology and Motivation
In this blog post series, I walk you through creating a fully local, offline AI pipeline. In this first post, I outline the motivation and relevant terminology that are important before we dive into hardware selection and implementation of the pipeline.
Your architecture was decided before you opened the schematic
Engineering teams often treat requirements as a simple feature checklist, but they actually hold the blueprint for your software architecture. By analyzing constraints collectively rather than in isolation, you can define critical architectural patterns—such as task scheduling and abstraction levels—long before the first schematic is drawn. This proactive approach eliminates wasted complexity, reduces development time, and allows software needs to inform hardware choices early in the cycle. Discover how to shift your design mindset to build lean, purposeful systems that align perfectly with business objectives from day one.
Finite State Machines (FSM) in Embedded Systems (Part 5) - From One FSM to Many
Traditionally, complex systems are implemented using multi-threading and mutexes. Trying to scale up this approach usually results in a nightmare of data races and hidden bugs. A single Finite State Machine may bring order to chaos in applications, but cannot be scaled beyond a limit. In this installment, we explore the Actor Model: a shift from shared state to communicating state machines. Discover how treating FSMs as independent, message-passing entities can eliminate concurrency issues, simplify testing, and improve your embedded architecture.
Debug, visualize and test embedded C/C++ through instrumentation
Instrumenting a firmware is a highly effective methodology for debugging and testing an embedded softwares. In this article, I will present a way of achieving this using Scrutiny, an open-source software suite developed as a personal initiative, designed to streamline debugging, telemetry, and hardware-in-the-loop (HIL) testing for embedded devices.
Can an RTOS be really real-time?
Real-Time Operating Systems are meant for real-time applications. But with conventional shared-state concurrency and blocking, can you honestly know the worst-case execution time of an RTOS thread?
Getting Started With Zephyr: DTS vs DTSI vs Overlays
Devicetrees can be daunting for traditional embedded software engineers that are new to Zephyr. In this blog post, I address these fears and show how navigating Devicetrees can be much easier if you understand that they represent the layered structure of the underlying hardware.
Why Containers Are the Cheat Code for Embedded DevOps
Embedded software teams have long accepted toolchain setup as “part of the job,” but it’s a hidden productivity killer. Manual installs waste days, slow onboarding, and derail CI pipelines with “works on my machine” issues. While enterprise software solved this years ago with containerization, many embedded teams are still stuck replicating fragile environments. Containers offer a proven fix: a portable, reproducible build environment that works identically on laptops and CI servers. No brittle scripts, mismatched versions, or wasted time—just code that builds. IAR has gone further by delivering pre-built, performance-tuned Docker images for Arm, RISC-V, and Renesas architectures, ready for GitHub Actions and CI/CD pipelines. For regulated industries, containers simplify audits and compliance by enabling validation once and reuse everywhere. The result: faster onboarding, consistent builds, and stronger safety assurance. Containers aren’t a luxury—they’re the cheat code embedded teams need to modernize DevOps and compete effectively.
How to Achieve Deterministic Behavior in Real-Time Embedded Systems
Ensuring deterministic behavior in real-time embedded systems is paramount for their reliability and performance. The ability to predict precisely how a system will respond to various inputs at any given time is crucial in critical applications such as medical devices, aerospace systems, and automotive safety mechanisms. Achieving deterministic behavior involves meticulous design, stringent testing, and adherence to strict timing constraints.
Hidden Gems from the Embedded Online Conference Archives - Part 3
Jack Ganssle shows us what we can learn by studying previous failures - and why this is essential for anyone working in embedded systems.
Finite State Machines (FSM) in Embedded Systems (Part 1) - There's a State in This Machine!
An introduction to state machines and their implementation. Working from an intuitive definition of the state machine concept, we will start with a straightforward implementation then we evolve it into a more robust and engineered solution.
So You Want To Be An Embedded Systems Developer
This is a practical, boots-on-the-ground roadmap of books, videos, and inexpensive dev boards you can actually use to become an embedded systems developer. It contrasts hobbyist platforms like Arduino and Raspberry Pi with professional ARM-based evaluation kits, lists must-read resources for firmware, real-time systems, and testing, and emphasizes hands-on practice and the safety responsibilities of working with real-world devices.
Analyzing the Linker Map file with a little help from the ELF and the DWARF
Running out of Flash or RAM is a familiar pain for firmware engineers, and the linker map only tells part of the story. This post shows how to combine the linker MAP with ELF symbol tables and DWARF debug info to recover static symbols, sizes, and source files that the map omits. It also describes a C# WinForms viewer that automates the parsing with binutils and helps you spot module and symbol-level memory waste.
++i and i++ : what’s the difference?
Although the ++ and -- operators are well known, there are facets of their operation and implementation that are less familiar to many developers.
Embedded Systems Roadmaps
What skills should every embedded systems engineer have? What should you study next to improve yourself as an embedded systems engineer? In this article I'll share with you a few lists from well-respected sources that seek to answer these questions, with the hope of helping provide you a path to mastery. Whether you've only just finished your first Arduino project or you've been building embedded systems for decades, I believe there's something in here for everyone to help improve themselves as embedded systems engineers.
Chebyshev Approximation and How It Can Help You Save Money, Win Friends, and Influence People
Are expensive math libraries or huge lookup tables eating CPU and flash on your microcontroller? In this practical guide Jason Sachs shows how Chebyshev polynomial approximation (with range reduction, splitting, and small interpolated tables) can give near-minimax accuracy while using far less code and runtime. The post compares Taylor series, plain and interpolated tables, and explains how to fit empirical sensor data and evaluate coefficients efficiently.
Finite State Machines (FSM) in Embedded Systems (Part 5) - From One FSM to Many
Traditionally, complex systems are implemented using multi-threading and mutexes. Trying to scale up this approach usually results in a nightmare of data races and hidden bugs. A single Finite State Machine may bring order to chaos in applications, but cannot be scaled beyond a limit. In this installment, we explore the Actor Model: a shift from shared state to communicating state machines. Discover how treating FSMs as independent, message-passing entities can eliminate concurrency issues, simplify testing, and improve your embedded architecture.
Understanding and Preventing Overflow (I Had Too Much to Add Last Night)
Integer overflow is stealthier than you think, and in embedded systems it can break control loops or corrupt data. Jason Sachs walks through the usual culprits, including addition, subtraction, multiplication, shifting and Q15 fixed-point traps, plus C-specific pitfalls such as undefined signed overflow and INT_MIN edge cases. He then lays out practical defenses: prefer fixed-width types, widen and saturate intermediates, enable wraparound where appropriate, and reason about modular congruence for compound arithmetic.
VHDL tutorial - A practical example - part 2 - VHDL coding
Gene Breniman walks through the VHDL coding for a CPLD-based data acquisition engine, turning the hardware spec into a working state machine and signal generators. The article explains SPI and I2S timing choices, an internal SPI peripheral latch, and counter-based timing (seqCount and CycleCnt) used to create LRCK, BCK, SPI SCK and nvSRAM write control. It’s a practical, implementation-focused guide for embedded designers.
3 Tips for Developing Embedded Systems with AI
Explore how to leverage AI in developing embedded systems with three practical tips, learn why documenting your workflows, supercharging testing and debugging, and adopting AI-assisted code generation can save time, reduce errors, and boost performance in your projects, and discover actionable insights to streamline development in resource-constrained environments, this blog explains how to prepare for AI integration while keeping the expertise of experienced engineers intact, offering real-world examples that show how even incremental AI adoption can revolutionize your development process, whether you’re new to AI or seeking to enhance existing practices, these strategies provide a clear roadmap to build smarter, more efficient embedded systems using AI.
Analyzing the Linker Map file with a little help from the ELF and the DWARF
Running out of Flash or RAM is a familiar pain for firmware engineers, and the linker map only tells part of the story. This post shows how to combine the linker MAP with ELF symbol tables and DWARF debug info to recover static symbols, sizes, and source files that the map omits. It also describes a C# WinForms viewer that automates the parsing with binutils and helps you spot module and symbol-level memory waste.
Chebyshev Approximation and How It Can Help You Save Money, Win Friends, and Influence People
Are expensive math libraries or huge lookup tables eating CPU and flash on your microcontroller? In this practical guide Jason Sachs shows how Chebyshev polynomial approximation (with range reduction, splitting, and small interpolated tables) can give near-minimax accuracy while using far less code and runtime. The post compares Taylor series, plain and interpolated tables, and explains how to fit empirical sensor data and evaluate coefficients efficiently.
Understanding and Preventing Overflow (I Had Too Much to Add Last Night)
Integer overflow is stealthier than you think, and in embedded systems it can break control loops or corrupt data. Jason Sachs walks through the usual culprits, including addition, subtraction, multiplication, shifting and Q15 fixed-point traps, plus C-specific pitfalls such as undefined signed overflow and INT_MIN edge cases. He then lays out practical defenses: prefer fixed-width types, widen and saturate intermediates, enable wraparound where appropriate, and reason about modular congruence for compound arithmetic.
So You Want To Be An Embedded Systems Developer
This is a practical, boots-on-the-ground roadmap of books, videos, and inexpensive dev boards you can actually use to become an embedded systems developer. It contrasts hobbyist platforms like Arduino and Raspberry Pi with professional ARM-based evaluation kits, lists must-read resources for firmware, real-time systems, and testing, and emphasizes hands-on practice and the safety responsibilities of working with real-world devices.
Zebras Hate You For No Reason: Why Amdahl's Law is Misleading in a World of Cats (And Maybe in Ours Too)
Amdahl’s Law is a useful warning, but Jason Sachs argues it can be misleading if you stop at the equation. Using the Kittens Game as a playful model, he shows how Gustafson’s perspective, positive feedback loops, and system-level synergy can turn modest component speedups into big real-world wins. The article closes with concrete embedded-systems examples like ISR timing and developer productivity.
Using the Beaglebone PRU to achieve realtime at low cost
Fabien Le Mentec shows how the BeagleBone Black's PRU coprocessors can run hard realtime control loops, removing the need for an FPGA or dedicated microcontroller. He walks through Linux setup, device tree enabling, assembler and loader tools, and a timer example that reads ADCs and drives PWM from PRU code. The post highlights community SDKs and a recent TI Code Composer Studio option for C-based PRU development.
Coroutines in one page of C
Yossi Kreinin shows how to get usable coroutines in plain C by combining setjmp/longjmp with a bit of inline assembly. The post walks through a working iterator example, explains why you must allocate and switch a separate stack, and outlines the start/yield/next API. It also flags portability pitfalls like stack growth direction and frame pointers, and points to makecontext and Tony Finch alternatives.
Important Programming Concepts (Even on Embedded Systems) Part I: Idempotence
Idempotence is a simple design principle that prevents duplicate effects when operations are retried or repeated. Jason Sachs shows why it matters in embedded systems, from HTTP submit buttons and capacitive touch inputs to garage-door remotes and SPI DAC writes. Read this post to learn three practical idempotent techniques and when redundant writes are a sensible reliability trade-off.
VHDL tutorial - A practical example - part 2 - VHDL coding
Gene Breniman walks through the VHDL coding for a CPLD-based data acquisition engine, turning the hardware spec into a working state machine and signal generators. The article explains SPI and I2S timing choices, an internal SPI peripheral latch, and counter-based timing (seqCount and CycleCnt) used to create LRCK, BCK, SPI SCK and nvSRAM write control. It’s a practical, implementation-focused guide for embedded designers.
Using the C language to program the am335x PRU
Assembly-language PRU development is tedious and error prone, so Fabien Le Mentec shows how to use TI's PRU C toolchain to simplify the workflow. He walks through installing the CGT package, integrating the compiler with a modified prussdrv loader to honor the _c_int00 start symbol, and provides a BeagleBone Black example with build scripts and sources on GitHub. The post also covers inline assembly constraints and code-size tradeoffs.






















