A linker is a build tool that combines one or more compiled object files and static libraries into a single executable binary, resolving symbolic references between translation units and assigning final memory addresses to code and data sections.
In practice
In embedded development, the linker does more than just stitch object files together. It uses a linker script (also called a scatter file on ARM/Keil toolchains) to map sections like .text, .data, .bss, and custom segments to specific regions of the target's memory map. Without a correct linker script, code intended for flash may land in the wrong address range, initialized data may not be copied to RAM at startup, or the stack and heap may overlap. Writing or modifying a linker script is one of the first tasks when bringing up a new board or MCU variant.
The linker produces a binary image (ELF, HEX, SREC, or raw BIN depending on toolchain and target) and typically a map file. The map file lists the final address and size of every symbol and section, making it indispensable for diagnosing why a build exceeds flash or RAM limits. The blog post "Analyzing the Linker Map file with a little help from the ELF and the DWARF" goes into detail on reading this output.
Common pitfalls include missing or mismatched section attributes (for example, forgetting NOLOAD on a .noinit section, or placing that section between the .bss start and end symbols, can cause memory that was meant to survive a reset to be loaded with zeros or cleared by the startup code), incorrect ORIGIN or LENGTH values that silently allow overlapping regions, and unresolved symbols that only surface at link time rather than compile time. When using C++ on microcontrollers, the linker script must also account for constructors and destructors listed in .init_array and .fini_array; the blog post "C++ on microcontrollers 2 - LPCXpresso, LPC-link, Code Sourcery, lpc21isp, linkerscript, LPC1114 startup" covers this in a practical bring-up context.
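To make the .noinit pitfall above concrete, here is a minimal C sketch of a boot counter that is meant to survive a warm reset. The names NOINIT, BOOT_MAGIC, boot_record, g_boot_record, and boot_record_update are hypothetical, and the sketch assumes a GCC-compatible compiler plus a linker script that places .noinit in RAM outside the range the startup code clears.

```c
#include <stdint.h>

/* Assumption: GCC or Clang on an ELF target. Elsewhere the macro expands to
 * nothing and the variable becomes an ordinary zero-initialized global. */
#if defined(__GNUC__) && defined(__ELF__)
#define NOINIT __attribute__((section(".noinit")))
#else
#define NOINIT
#endif

#define BOOT_MAGIC 0xB007C0DEu /* hypothetical marker value */

struct boot_record {
    uint32_t magic;      /* equals BOOT_MAGIC once the record is valid */
    uint32_t boot_count; /* boots counted since the record became valid */
};

/* No initializer on purpose: the contents are meant to be whatever RAM
 * held before the reset. */
NOINIT struct boot_record g_boot_record;

/* Validate the record, (re)initialize it if the marker is absent, and
 * return the updated boot count. */
uint32_t boot_record_update(struct boot_record *rec)
{
    if (rec->magic != BOOT_MAGIC) {
        /* Marker missing: first boot after power-on, or the memory was
         * cleared or overwritten. */
        rec->magic = BOOT_MAGIC;
        rec->boot_count = 0u;
    }
    rec->boot_count++;
    return rec->boot_count;
}
```

Calling boot_record_update(&g_boot_record) early in main() should yield a count that rises across warm resets. If it never rises above 1, the section is probably being cleared or reloaded, which points back at the linker script or the startup code.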
On most bare-metal projects, the runtime startup file (crt0.s or equivalent) relies on symbols defined by the linker script (such as _etext, _sdata, _edata, _sbss, _ebss) to copy the .data section from flash to RAM and zero out .bss before main() is called. If one of those symbols is missing, the link normally fails with an undefined reference. A symbol that resolves to the wrong address is harder to spot: the program will appear to run, but global and static variables will hold garbage values.
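The copy-and-zero step described above can be sketched in C. The function name init_data_and_bss is hypothetical, and the sketch assumes word-aligned section boundaries; real startup files are often written in assembly and differ in detail.

```c
#include <stdint.h>

/*
 * Copy the initial values of .data from their load address (flash) to their
 * run address (RAM), then zero .bss. All pointers are assumed word-aligned,
 * and each end pointer is one past the last word of its section.
 */
void init_data_and_bss(const uint32_t *data_load,
                       uint32_t *data_start, uint32_t *data_end,
                       uint32_t *bss_start, uint32_t *bss_end)
{
    const uint32_t *src = data_load;

    for (uint32_t *dst = data_start; dst < data_end; ++dst) {
        *dst = *src++;
    }
    for (uint32_t *dst = bss_start; dst < bss_end; ++dst) {
        *dst = 0u;
    }
}

/*
 * On a real target the arguments would be the addresses of linker-script
 * symbols. Assuming the symbol names used in the text (they vary by script):
 *
 *   extern uint32_t _etext, _sdata, _edata, _sbss, _ebss;
 *   init_data_and_bss(&_etext, &_sdata, &_edata, &_sbss, &_ebss);
 */
```

Passing the boundaries as parameters keeps the routine testable on a host machine with plain arrays, while the target build supplies the linker-defined addresses.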
Frequently asked
What is a linker script and do I always need one?
A linker script tells the linker where in the memory map to place each section and how large each memory region is. For hosted desktop targets, the toolchain supplies a default script that works against the OS virtual memory model. On bare-metal embedded targets, you almost always need a target-specific linker script because the memory layout (flash base address, RAM size, presence of CCM or DTCM regions, etc.) varies per MCU. Most vendor SDKs and CMSIS device packs ship one; writing your own is necessary when the defaults do not match your hardware.
What is the difference between the linker and the compiler?
The compiler translates a single C or C++ source file into an object file containing machine code and unresolved symbolic references. The linker operates after all compilation is done: it takes the full set of object files and libraries, resolves every cross-file symbol reference, and produces the final addressed binary. Errors like 'undefined reference to foo' are linker errors, not compiler errors, because the compiler only sees one translation unit at a time.
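A minimal sketch of that division of labor, using hypothetical names (sensor_read, app_sample_twice). The three parts would normally live in separate files; they are shown in one block here so that it compiles as a unit.

```c
/* sensor.h (hypothetical): a declaration only. This is all the compiler
 * needs in order to compile code that calls sensor_read(). */
int sensor_read(void);

/* app.c (hypothetical): compiles on its own into an object file that
 * contains an unresolved reference to sensor_read. */
int app_sample_twice(void)
{
    return sensor_read() + sensor_read();
}

/* sensor.c (hypothetical): the definition the linker looks for. If no
 * object file or library provides it, the link step fails with an
 * undefined reference to sensor_read, even though app.c compiled cleanly. */
int sensor_read(void)
{
    return 21;
}
```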
How do I find out what is consuming my flash or RAM?
The linker map file is the primary tool. It lists every object file and library, the sections they contribute, and the final address and size of each symbol. Most toolchains (GCC ld, LLVM lld, IAR ILINK, Keil armlink) can generate a map file. With GNU ld the flag is -Map=output.map (written as -Wl,-Map=output.map when linking through the gcc driver); other toolchains offer an equivalent flag or IDE option. The blog post "Analyzing the Linker Map file with a little help from the ELF and the DWARF" explains how to interpret the map file alongside ELF section headers and DWARF debug info.
What does '--gc-sections' do and should I use it?
When passed to GCC ld (in conjunction with compiling with -ffunction-sections and -fdata-sections), --gc-sections instructs the linker to discard any code or data section that is not reachable from the entry point. This can significantly reduce flash usage by eliminating unused library functions and dead code. It is safe for most bare-metal projects, but interrupt vector tables and other sections referenced only indirectly (for example, by hardware or by assembly) must be marked with KEEP() in the linker script to prevent them from being garbage-collected.
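The kind of object that needs KEEP() can be sketched in C. This is a simplified, hypothetical vector table: a real Cortex-M table starts with the initial stack pointer and has many more entries. The section name .isr_vector, the VECTOR_SECTION macro, and the handler names are assumptions modeled on common vendor startup files.

```c
typedef void (*isr_handler_t)(void);

/* Assumption: GCC or Clang on an ELF target. The "used" attribute stops the
 * compiler from dropping the table; it does not stop the linker. */
#if defined(__GNUC__) && defined(__ELF__)
#define VECTOR_SECTION __attribute__((section(".isr_vector"), used))
#else
#define VECTOR_SECTION
#endif

/* Placeholder handlers; real ones would start the C runtime or trap. */
void Reset_Handler(void) {}
void Default_Handler(void) {}

/* No C code refers to this table. The hardware reads it at a fixed address,
 * so the linker sees no reference from the entry point and would discard
 * the section under --gc-sections unless the linker script wraps it,
 * for example as KEEP(*(.isr_vector)). */
VECTOR_SECTION const isr_handler_t g_vector_table[] = {
    Reset_Handler,
    Default_Handler,
    Default_Handler,
};
```

A quick check after enabling --gc-sections is to search the map file for the table's symbol; if it is listed among the discarded sections, the KEEP() wrapper is missing.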
What is the difference between a static library and a shared library in the context of embedded linking?
A static library (.a file) is an archive of object files that the linker searches at build time, pulling in only the object files needed to resolve outstanding symbols. The resulting binary is self-contained. Shared libraries (.so files) are resolved at runtime by a dynamic linker and require an OS with a dynamic loader. Most bare-metal and RTOS-based embedded targets use only static linking because there is no OS-level dynamic loader. Shared libraries are found on embedded Linux platforms (Cortex-A class SoCs running Linux), but not on typical Cortex-M or similar bare-metal MCU targets.
Differentiators vs similar concepts
The linker is often confused with the assembler or the compiler because all three are invoked as part of a build. The compiler (e.g., gcc, clang, armcc) translates source to object code. The assembler translates assembly source to object code. The linker combines object files into a final binary. In many toolchains, invoking the compiler driver (gcc, arm-none-eabi-gcc) with source files appears to do everything in one step, but the driver internally calls the preprocessor, compiler, assembler, and finally the linker as separate stages. The linker is also sometimes confused with the locator, a term used in some toolchain documentation (particularly IAR) for the stage that assigns final addresses; in GCC-based toolchains, linking and locating are performed by the same tool (ld).