How to Deploy Local LLMs for Embedded Software Development: Terminology and Motivation
In this blog post series, I walk you through creating a fully local, offline AI pipeline. In this first post, I outline the motivation and relevant terminology that are important before we dive into hardware selection and implementation of the pipeline.
Your architecture was decided before you opened the schematic
Engineering teams often treat requirements as a simple feature checklist, but they actually hold the blueprint for your software architecture. By analyzing constraints collectively rather than in isolation, you can define critical architectural patterns—such as task scheduling and abstraction levels—long before the first schematic is drawn. This proactive approach eliminates wasted complexity, reduces development time, and allows software needs to inform hardware choices early in the cycle. Discover how to shift your design mindset to build lean, purposeful systems that align perfectly with business objectives from day one.
GNU Linker Scripts. Part 1. .data, .bss, and the Startup Contract
Before your first line of C code executes, your system must establish a vital memory contract. Discover how the GNU Linker manages the transition from power-on to a ready-to-run state by deconstructing the roles of .data and .bss sections. Learn how to map Virtual and Load Memory Addresses effectively and decode the startup routines that initialize your global variables. By mastering these fundamental linker script mechanics, you gain total control over your embedded application's memory layout and ensure your startup code performs reliably every time.
Beyond the Packet: Designing Reliable Serial Communication for Embedded Systems
Serial communication between microcontrollers sounds simple until the protocol quietly breaks your system. Prabo Semasinghe walks through the design steps for building a robust communication framework: packet structure, error detection, acknowledgment handling, state machine design, and the failure-mode testing that actually proves it works.
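To make the packet-structure and error-detection steps concrete, here is a minimal sketch of one possible frame layout. It is not taken from the article: the start byte `0x7E`, the one-byte length field, the choice of CRC-16 (the XMODEM variant provided by Python's `binascii.crc_hqx`), and the names `build_frame` and `parse_frame` are all assumptions for illustration. Acknowledgment handling and byte stuffing are left out.

```python
import binascii

START_BYTE = 0x7E  # hypothetical start-of-frame marker, chosen for illustration


def build_frame(payload: bytes) -> bytes:
    """Wrap a payload as: start byte, length, payload, CRC-16 (big-endian)."""
    if len(payload) > 255:
        raise ValueError("payload must fit in a one-byte length field")
    body = bytes([len(payload)]) + payload
    crc = binascii.crc_hqx(body, 0)  # CRC-CCITT (XMODEM variant), initial value 0
    return bytes([START_BYTE]) + body + crc.to_bytes(2, "big")


def parse_frame(frame: bytes) -> bytes:
    """Return the payload, or raise ValueError if the frame is malformed."""
    if len(frame) < 4 or frame[0] != START_BYTE:
        raise ValueError("bad start byte or frame too short")
    length = frame[1]
    if len(frame) != length + 4:  # start + length + payload + 2 CRC bytes
        raise ValueError("length field does not match frame size")
    body = frame[1:-2]
    received_crc = int.from_bytes(frame[-2:], "big")
    if binascii.crc_hqx(body, 0) != received_crc:
        raise ValueError("CRC mismatch")
    return frame[2:-2]
```

Calling `parse_frame(build_frame(b"ping"))` returns the original payload, while a frame with a flipped bit is rejected with a `ValueError`. A real receiver would typically run this check inside a state machine that resynchronizes on the start byte.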
I Stopped Testing Embedded Systems by Hand. Here's What Replaced It.
Everardo Garcia walks through the shift from manual, terminal-based system-level testing to automated tests that run during development. He shows how OpenHTF (a framework originally built at Google for manufacturing lines) plus a laptop, a USB cable, and ~150 lines of Python closes the functional testing gap most embedded teams carry, and how spec-driven prompting with GitHub Copilot makes writing plugs and phases fast enough to keep up.
Finite State Machines (FSM) in Embedded Systems (Part 5) - From One FSM to Many
Traditionally, complex systems are implemented using multi-threading and mutexes. Trying to scale up this approach usually results in a nightmare of data races and hidden bugs. A single Finite State Machine may bring order to chaos in applications, but cannot be scaled beyond a limit. In this installment, we explore the Actor Model: a shift from shared state to communicating state machines. Discover how treating FSMs as independent, message-passing entities can eliminate concurrency issues, simplify testing, and improve your embedded architecture.
Your Unit Tests Won't Find the Wolves: Why Embedded Developers Should Be Fuzzing
You test the happy paths. You check the well-formatted packets and the expected inputs. But real users don't read manuals, and real data doesn't follow your protocol spec. Fuzzing throws millions of randomized inputs at your code to find the crashes you never thought to look for. Here's why it matters for embedded systems.
Quickfire Heuristics: A Fast Usability Evaluation Framework for Lean Hardware Teams
That device with the single LED that requires you to count blink patterns just to understand system status. The button you must hold for 8 seconds, which also performs four other actions depending on hold duration. These are not accidents of negligence; they are the predictable output of development processes that have no rigorous usability evaluation component. Usability tends to slip through the gaps of standard engineering reviews, surfacing late, when design flexibility is already gone. This article introduces a framework that adapts Jakob Nielsen's Ten Usability Heuristics for hardware and embedded systems, translating each principle into concrete evaluation questions for physical interfaces, firmware state machines, constrained displays, and cross-layer interactions. Using a smartwatch as the running example, it also introduces a structured session format, maps the framework to key lifecycle stages, and extends it to manufacturing, test, and field service contexts.
Debug, visualize and test embedded C/C++ through instrumentation
Instrumenting firmware is a highly effective way to debug and test embedded software. In this article, I will present a way of achieving this using Scrutiny, an open-source software suite developed as a personal initiative and designed to streamline debugging, telemetry, and hardware-in-the-loop (HIL) testing for embedded devices.
Never use Float or Integer
Ada treats numbers as more than just numbers, and that changes how embedded code fails. This post shows why you should avoid using Float and Integer directly, then demonstrates how distinct types, ranges, and subtypes let the compiler catch unit mix-ups and out-of-range values before runtime. It also shows the same code running on a Raspberry Pi Pico, and briefly introduces SPARK for proving correctness.
Creating a Hardware Abstraction Layer (HAL) in C
In my last post, C to C++: Using Abstract Interfaces to Create Hardware Abstraction Layers (HAL), I discussed how vital hardware abstraction layers are and how to use a C++ abstract interface to create them. You may be thinking, that’s great for C++, but I work in C! How do I create a HAL that can easily swap in and out different drivers? In today’s post, I will walk through exactly how to do that while using the I2C bus as an example.
Modern C++ in Embedded Development: (Don't Fear) The ++
While C is still the language of choice for embedded development, the adoption of C++ has grown steadily. Yet, reservations about dynamic memory allocation and fears of unnecessary code bloat have kept many in the C camp. This discourse aims to explore the intricacies of employing C++ in embedded systems, negotiating the issues of dynamic memory allocation, and exploiting the benefits of C++ offerings like std::array and constexpr. Moreover, it ventures into the details of the zero-overhead principle and the nuanced distinctions between C and C++. The takeaway? Armed with the right knowledge and a careful approach, C++ can indeed serve as a powerful, safer, and more efficient tool for embedded development.
You Don't Need an RTOS (Part 1)
In this first article, we'll compare our two contenders, the superloop and the RTOS. We'll define a few terms that help us describe exactly what a scheduler does and why an RTOS can make certain systems work that wouldn't with a superloop. By the end of this article, you'll be able to: (1) measure or calculate the deadlines, periods, and worst-case execution times for each task in your system; (2) determine, using either a response-time analysis or a utilization test, whether that set of tasks is schedulable using either a superloop or an RTOS; and (3) assign RTOS task priorities optimally.
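As a rough illustration of what a utilization test can look like, the sketch below uses the classic Liu and Layland bound for rate-monotonic scheduling. That choice is an assumption, since the teaser does not say which test the article uses. It also assumes independent periodic tasks with deadlines equal to their periods, and the function names are hypothetical.

```python
def utilization(tasks):
    """Total CPU utilization for tasks given as (wcet, period) pairs in one time unit."""
    return sum(wcet / period for wcet, period in tasks)


def rm_bound(n):
    """Liu and Layland utilization bound for n tasks under rate-monotonic priorities."""
    return n * (2 ** (1 / n) - 1)


def passes_utilization_test(tasks):
    """Sufficient (not necessary) check: True means schedulable under the stated assumptions."""
    return utilization(tasks) <= rm_bound(len(tasks))


def rate_monotonic_order(tasks):
    """Sort tasks so the shortest period (highest priority) comes first."""
    return sorted(tasks, key=lambda task: task[1])
```

For example, three tasks with (WCET, period) pairs of (1, 10), (2, 20), and (5, 50) have a total utilization of about 0.3, well under the three-task bound of roughly 0.78, so the set passes. A set that fails this test is not necessarily unschedulable; a response-time analysis gives the exact answer.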
++i and i++ : what’s the difference?
Although the ++ and -- operators are well known, there are facets of their operation and implementation that are less familiar to many developers.
Finite State Machines (FSM) in Embedded Systems (Part 1) - There's a State in This Machine!
An introduction to state machines and their implementation. Working from an intuitive definition of the state machine concept, we start with a straightforward implementation and then evolve it into a more robust, engineered solution.
Getting Started With Zephyr: Kconfig
In this blog post, we briefly look at Kconfig, one of the core pieces of the Zephyr infrastructure. Kconfig allows embedded software developers to turn specific subsystems on or off within Zephyr efficiently and control their behavior. We also learn how we can practically use Kconfig to control the features of our application using the two most common mechanisms.
Adventures in Signal Processing with Python
Jason Sachs shows how PyLab (numpy, scipy, matplotlib) can handle many signal-processing and visualization tasks engineers usually reach for MATLAB to do. He walks through practical examples including PWM ripple, two-pole RC filters, and symbolic math with SymPy, and shares real-world installation tips and trade-offs. The post closes with pointers to IPython and pandas to speed interactive analysis and data handling.
Memory Mapped I/O in C
Interacting with memory-mapped device registers is at the base of all embedded development. Let's explore what tools the C language, the industry standard, provides the developer for this task.
3 Tips for Developing Embedded Systems with AI
Explore how to leverage AI in developing embedded systems with three practical tips. Learn why documenting your workflows, supercharging testing and debugging, and adopting AI-assisted code generation can save time, reduce errors, and boost performance in your projects, and discover actionable insights to streamline development in resource-constrained environments. This blog explains how to prepare for AI integration while keeping the expertise of experienced engineers intact, offering real-world examples that show how even incremental AI adoption can revolutionize your development process. Whether you’re new to AI or seeking to enhance existing practices, these strategies provide a clear roadmap to build smarter, more efficient embedded systems using AI.
Introduction to Microcontrollers - Beginnings
Mike Silva's beginner tutorial series walks through core microcontroller concepts and practical steps to get started, from wiring an LED blinky to understanding startup code. He compares embedded and desktop programming, explains why C and assembly matter, and introduces AVR and STM32 Cortex-M3 toolchains and hardware. Expect clear examples, no-nonsense tool advice, and the essential hardware knowledge to move from simulator to a real board.
Analyzing the Linker Map file with a little help from the ELF and the DWARF
Running out of Flash or RAM is a familiar pain for firmware engineers, and the linker map only tells part of the story. This post shows how to combine the linker MAP with ELF symbol tables and DWARF debug info to recover static symbols, sizes, and source files that the map omits. It also describes a C# WinForms viewer that automates the parsing with binutils and helps you spot module and symbol-level memory waste.
MSP430 Launchpad Tutorial - Part 2 - Interrupts and timers
Interrupts let the MSP430 respond to events without wasting CPU time, and this tutorial walks through using TimerA and Port 1 interrupts on the LaunchPad. Enrico shows how to configure TACTL, CCR0 and CCTL0 to generate a periodic TimerA interrupt, and how to set up P1IE, P1IES and P1IFG to catch a button press. The code toggles LEDs and enters LPM0 while waiting for interrupts.
Chebyshev Approximation and How It Can Help You Save Money, Win Friends, and Influence People
Are expensive math libraries or huge lookup tables eating CPU and flash on your microcontroller? In this practical guide Jason Sachs shows how Chebyshev polynomial approximation (with range reduction, splitting, and small interpolated tables) can give near-minimax accuracy while using far less code and runtime. The post compares Taylor series, plain and interpolated tables, and explains how to fit empirical sensor data and evaluate coefficients efficiently.
MSP430 LaunchPad Tutorial - Part 4 - UART Transmission
Want to stream sensor or debug data from an MSP430 LaunchPad to a PC or Bluetooth module? Enrico swaps in an MSP430G2553 and shows how to configure SMCLK, P1 pin multiplexing, and UCA0 baud/dividers (with modulation) to approximate 115200 baud. The post also walks through interrupt-driven RX/TX handling and a low-power wait loop that sends a "Hello World" reply on demand.
Understanding and Preventing Overflow (I Had Too Much to Add Last Night)
Integer overflow is stealthier than you think, and in embedded systems it can break control loops or corrupt data. Jason Sachs walks through the usual culprits, including addition, subtraction, multiplication, shifting and Q15 fixed-point traps, plus C-specific pitfalls such as undefined signed overflow and INT_MIN edge cases. He then lays out practical defenses: prefer fixed-width types, widen and saturate intermediates, enable wraparound where appropriate, and reason about modular congruence for compound arithmetic.
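The "widen and saturate" defense can be modeled in a few lines. This is only a sketch of the idea, not code from the article: Python integers never overflow, so `wrap_int16` is a hypothetical helper that imitates what a 16-bit two's-complement register would hold, and `sat_add_q15` is an assumed name for a Q15 add that clamps instead of wrapping.

```python
Q15_MIN, Q15_MAX = -32768, 32767  # range of a signed 16-bit (Q15) value


def wrap_int16(value):
    """Model two's-complement wraparound of a 16-bit register."""
    return (value + 32768) % 65536 - 32768


def sat_add_q15(a, b):
    """Add two Q15 values in a wider intermediate, then clamp to the int16 range."""
    total = a + b  # unbounded Python int stands in for a 32-bit intermediate
    return max(Q15_MIN, min(Q15_MAX, total))
```

With inputs 30000 and 10000, the wrapped result is -25536, a sign flip that could destabilize a control loop, while the saturated result stays pinned at 32767. In C the same pattern would typically widen to `int32_t` before clamping.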
MSP430 Launchpad Tutorial - Part 1 - Basics
A working button-driven LED on the MSP430 LaunchPad is only a few steps away. Enrico Garante walks through creating a CCS project, setting P1.0 as the LED output and enabling P1.3 button interrupts, then shows the interrupt service routine that toggles the LED. The short tutorial covers stopping the watchdog, configuring P1DIR/P1OUT, clearing flags, and launching the code so you can get blinking quickly.
Coroutines in one page of C
Yossi Kreinin shows how to get usable coroutines in plain C by combining setjmp/longjmp with a bit of inline assembly. The post walks through a working iterator example, explains why you must allocate and switch a separate stack, and outlines the start/yield/next API. It also flags portability pitfalls like stack growth direction and frame pointers, and points to makecontext and Tony Finch alternatives.
Important Programming Concepts (Even on Embedded Systems) Part I: Idempotence
Idempotence is a simple design principle that prevents duplicate effects when operations are retried or repeated. Jason Sachs shows why it matters in embedded systems, from HTTP submit buttons and capacitive touch inputs to garage-door remotes and SPI DAC writes. Read this post to learn three practical idempotent techniques and when redundant writes are a sensible reliability trade-off.