SRAM (Static Random-Access Memory) is a type of volatile memory that stores each bit in a flip-flop circuit, retaining its value as long as power is supplied without requiring periodic refresh cycles. It is the dominant form of on-chip RAM in microcontrollers and embedded processors, used to hold variables, the stack, heap, and data buffers at runtime.
In practice
On most microcontrollers, SRAM is the primary writable memory available at full CPU speed, though some devices also expose other tightly-coupled or closely-mapped regions with comparable access times. It holds all runtime state: global and static variables, the call stack, heap allocations (if any), and temporary buffers. Sizes range from tens of bytes on the smallest 8-bit parts (e.g., 64 bytes on the ATtiny13A) to hundreds of kilobytes on Cortex-M4/M7 devices (e.g., 256 KB on some STM32F4 variants, up to 1 MB on certain STM32H7 variants), and several megabytes on application-class SoCs that integrate tightly-coupled SRAM alongside external DRAM.
SRAM is consumed from two directions on most bare-metal targets: the stack grows downward from the top of the SRAM region, while statically allocated data and the heap occupy addresses upward from the bottom. Stack overflow occurs when these regions collide, often silently corrupting data rather than triggering an immediate fault on parts without an MPU. The blog post "Are We Shooting Ourselves in the Foot with Stack Overflow?" covers this failure mode in detail. Careful linker script configuration and stack-size analysis are essential, especially in systems using deep interrupt nesting or recursion.
SRAM is fast but scarce. On resource-constrained devices, fitting an algorithm into available SRAM is a real design constraint, not an afterthought. The blog post "Flood Fill, or: The Joy of Resource Constraints" illustrates how algorithm choice is driven directly by SRAM limits. Common techniques to reduce SRAM pressure include storing constant lookup tables in flash (using const and linker directives or, on AVR, PROGMEM), using smaller data types, and avoiding deep call stacks.
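Two of the techniques above, const tables and narrower data types, can be sketched in plain C. This is a minimal illustration rather than a recipe for any particular toolchain: the table name, its values (approximate quarter-wave sine samples scaled to 0..255), and both struct layouts are hypothetical, and whether a const object really ends up in flash depends on the linker script and architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SINE_LEN 16u

/* const at file scope: typical bare-metal linker scripts place this in
 * .rodata, which usually lives in flash instead of SRAM. On AVR, PROGMEM
 * and pgm_read_byte() would be needed instead. */
static const uint8_t sine_quarter[SINE_LEN] = {
      0,  25,  50,  74,  98, 120, 142, 162,
    180, 197, 212, 225, 236, 244, 250, 254
};

/* Return one table entry; the caller must pass an index below SINE_LEN. */
uint8_t sine_lookup(size_t i)
{
    assert(i < SINE_LEN);
    return sine_quarter[i];
}

/* The same logical record twice: word-sized fields vs. narrowed fields. */
struct sample_wide   { uint32_t channel; uint32_t value; uint32_t flags; };
struct sample_narrow { uint8_t  channel; uint16_t value; uint8_t  flags; };
```

With an array of a few hundred records, the narrower layout can make the difference between fitting in SRAM and not. Comparing sizeof for both structs on the actual target is the reliable way to see the saving, since padding rules vary by ABI.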
Some MCUs provide multiple physically separate SRAM banks that may be accessible by distinct bus masters in parallel, depending on the device's bus matrix and DMA configuration, which is useful for DMA double-buffering or for keeping tightly-coupled memory (TCM) close to the CPU. On Cortex-M7 parts like the STM32H7, ITCM and DTCM are distinct from the general AXI SRAM and offer low-latency, deterministic access (typically single-cycle at supported clock configurations); code or data must be explicitly placed there via linker sections or at runtime.
Frequently asked
What is the difference between SRAM and DRAM?
SRAM stores each bit in a flip-flop (typically 6 transistors per cell) and holds its value as long as power is present, with no refresh needed. DRAM stores each bit as a charge on a capacitor (1 transistor + 1 capacitor per cell), which leaks and must be refreshed periodically by a memory controller. SRAM is faster, simpler to interface, and more expensive per bit; DRAM achieves much higher density at lower cost. Nearly all on-chip MCU RAM is SRAM. External DRAM (SDRAM, LPDDR) is used when an application-class SoC or a high-end MCU with an FMC/FSMC peripheral needs more RAM than can fit on-die.
Why does SRAM content get corrupted or lost on reset or power cycle?
SRAM is volatile: it requires continuous power to maintain its state. On power-up, each cell settles into a semi-random state (often with a vendor-specific but not guaranteed bias), so SRAM contents after a cold reset are effectively undefined. The startup code (typically crt0 or the C runtime init in a vendor HAL) initializes the .bss section to zero and copies initial values for .data from flash before main() runs. If startup code is bypassed, or in regions it does not reinitialize, values written before a warm reset may persist, but this should not be relied upon unless the MCU datasheet explicitly defines retention behavior across that specific reset type.
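The .data copy and .bss zeroing described above can be sketched as two small loops. This is a host-runnable illustration, not any vendor's actual startup file: in a real crt0 the pointers come from linker-script symbols (names such as _sdata, _sidata, or _sbss vary by toolchain and are not assumed here), and startup_demo() uses ordinary arrays as stand-ins for flash and SRAM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy initial values for .data from its load image (flash) into SRAM. */
void init_data(uint32_t *dst, const uint32_t *src, size_t words)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < words; i++) {
        dst[i] = src[i];
    }
}

/* Zero the .bss region so statics without initializers start at 0. */
void init_bss(uint32_t *start, size_t words)
{
    assert(start != NULL);
    for (size_t i = 0; i < words; i++) {
        start[i] = 0u;
    }
}

/* Host-side demo: arrays stand in for the flash image and for SRAM that
 * holds leftover garbage after power-up. Returns 1 if init succeeded. */
int startup_demo(void)
{
    static const uint32_t flash_image[3] = { 1u, 2u, 3u };
    uint32_t sram_data[3] = { 0xAAAAAAAAu, 0x55555555u, 0xFFFFFFFFu };
    uint32_t sram_bss[4]  = { 0xDEADBEEFu, 7u, 9u, 11u };

    init_data(sram_data, flash_image, 3);
    init_bss(sram_bss, 4);

    return sram_data[0] == 1u && sram_data[2] == 3u
        && sram_bss[0] == 0u && sram_bss[3] == 0u;
}
```

Anything the linker script places outside the ranges passed to these two loops is left untouched, which is why such regions can carry stale values across a warm reset.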
How do I find out how much SRAM my application is using?
The linker map file is the most reliable source. It lists the size of each section (.data, .bss, heap, stack) and their placement in SRAM. Many toolchains also report section sizes via the 'size' utility (e.g., arm-none-eabi-size). These figures cover static allocation; peak stack depth at runtime requires additional analysis, either through static analysis tools, watermark patterns (filling the stack with a known value like 0xDEADBEEF at startup and checking how far it was overwritten), or hardware watchpoints.
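The watermark approach can be sketched as follows. This is a simplified host-side illustration under stated assumptions: the function names are hypothetical, the stack is assumed to grow downward, and an ordinary array stands in for the real stack region. On a target, the painting step must stop below the current stack pointer so that live frames are not overwritten.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xDEADBEEFu

/* Paint the stack region with the fill pattern (done once at startup). */
void stack_paint(uint32_t *base, size_t words)
{
    assert(base != NULL);
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL;
    }
}

/* Count untouched words from the lowest address upward. On a descending
 * stack this is the headroom that was never used. */
size_t stack_unused_words(const uint32_t *base, size_t words)
{
    assert(base != NULL);
    size_t n = 0;
    while (n < words && base[n] == STACK_FILL) {
        n++;
    }
    return n;
}

/* Host-side demo: a 64-word array plays the role of the stack region. */
size_t watermark_demo(void)
{
    uint32_t fake_stack[64];
    stack_paint(fake_stack, 64);

    /* Pretend the program pushed 10 words, starting at the top. */
    for (size_t i = 64; i > 54; i--) {
        fake_stack[i - 1] = (uint32_t)i;
    }
    return stack_unused_words(fake_stack, 64);
}
```

The result is a lower bound on usage rather than an exact figure: a frame that happens to store the fill value, or that reserves space without writing to it, can make the stack look shallower than it was.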
What is tightly-coupled memory (TCM) and how does it differ from general SRAM?
TCM is a dedicated SRAM bank connected directly to the CPU core via a private bus, bypassing the main system interconnect and any caches. On Cortex-M7 parts (e.g., STM32H7, MIMXRT1060), ITCM and DTCM provide deterministic low-latency access regardless of other bus traffic, making them suitable for time-critical ISRs or DSP loops. General-purpose SRAM on the same device may be accessed over a shared AXI bus and can experience variable latency under load. Placement into TCM requires explicit linker section assignments or runtime copy.
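A linker section assignment of the kind mentioned above might look like the following with GCC-style attributes. Treat it as a sketch under assumptions: ".dtcm_data" is a hypothetical section name that only has an effect if the project's linker script maps it to the DTCM address range and the startup code initializes it, and the small FIR filter is just a placeholder for a time-critical loop. On non-ARM builds the macro expands to nothing so the example still compiles on a host.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical section name; it must match an output section that the
 * linker script places in DTCM. Disabled on non-ARM (host) builds. */
#if defined(__arm__) && defined(__GNUC__)
#define DTCM_DATA __attribute__((section(".dtcm_data")))
#else
#define DTCM_DATA
#endif

#define FIR_TAPS 4u

/* Filter state that the hot loop touches on every sample. */
DTCM_DATA static int32_t fir_history[FIR_TAPS];

/* Coefficients are const, so they would normally stay in flash. */
static const int32_t fir_coeff[FIR_TAPS] = { 1, 2, 2, 1 };

/* Shift in one sample and return the filtered output. */
int32_t fir_step(int32_t sample)
{
    int32_t acc = 0;

    for (size_t i = FIR_TAPS - 1u; i > 0u; i--) {
        fir_history[i] = fir_history[i - 1u];
    }
    fir_history[0] = sample;

    for (size_t i = 0; i < FIR_TAPS; i++) {
        acc += fir_coeff[i] * fir_history[i];
    }
    return acc;
}
```

Checking the map file after linking is the simplest way to confirm that fir_history actually landed at a DTCM address instead of silently falling back to general SRAM.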
Can SRAM be used for memory-mapped I/O?
No. Memory-mapped I/O refers to peripheral registers (timers, UARTs, GPIO, etc.) that are mapped into the processor's address space, typically in a region separate from SRAM. SRAM is used to store program data, not to access hardware registers. The blog post "Memory Mapped I/O in C" explains how to safely access peripheral registers in C using volatile-qualified pointers, which is a distinct concept from using SRAM for data storage.
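For contrast with ordinary SRAM variables, register access through volatile-qualified members can be sketched like this. The UART layout, bit position, and names below are invented for illustration and do not describe any real device; actual offsets and addresses come from the part's reference manual. The demo function points the driver at an ordinary struct so the code can run on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block for an imaginary UART. The volatile
 * qualifier makes the compiler perform every read and write as written. */
typedef struct {
    volatile uint32_t status;  /* bit 0: transmitter ready */
    volatile uint32_t data;    /* write a byte here to transmit */
} uart_regs_t;

#define UART_STATUS_TX_READY (1u << 0)

/* Send one byte if the transmitter is ready.
 * Returns 1 on success, 0 if the transmitter is busy. */
int uart_try_send(uart_regs_t *uart, uint8_t byte)
{
    assert(uart != NULL);
    if ((uart->status & UART_STATUS_TX_READY) == 0u) {
        return 0;
    }
    uart->data = byte;
    return 1;
}

/* Host-side demo: a struct in normal memory stands in for the peripheral.
 * On hardware, the pointer would be a fixed address cast to uart_regs_t *. */
int uart_demo(void)
{
    uart_regs_t fake = { .status = UART_STATUS_TX_READY, .data = 0u };
    int sent = uart_try_send(&fake, 0x41u);
    return sent && fake.data == 0x41u;
}
```

The only SRAM involved on a real target is whatever the driver keeps for its own state; the register block itself lives in the peripheral address range.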
Differentiators vs similar concepts
SRAM is often contrasted with DRAM (requires refresh, higher density, used as external RAM on application processors), Flash (non-volatile, used for code and const data storage, slower to write, limited write endurance), and EEPROM (non-volatile, byte-addressable, limited write endurance, though often rated for more write cycles than program Flash on the same device). Within SRAM itself, on-chip SRAM should be distinguished from external SRAM chips (connected via parallel bus or SPI), which are slower and require additional interface circuitry. TCM (tightly-coupled memory) is a subtype of on-chip SRAM with a private CPU connection for deterministic latency, found on Cortex-M7 and Cortex-R parts.