A checksum is a value computed from a block of data and stored or transmitted alongside it (usually appended) so that a receiver or reader can verify data integrity by recomputing the value and comparing it with the stored or transmitted one. A mismatch means the integrity check failed; the cause may be data corruption, a transmission error, mismatched algorithm parameters, an implementation bug, or something else.
In practice
Checksums appear in nearly every layer of embedded communication and storage. Serial protocols such as UART-framed packet formats, bootloader image transfers, and EEPROM/flash parameter blocks all use some form of checksum to detect silent corruption. The simplest forms, such as an 8-bit sum of all bytes modulo 256 or an XOR-reduction, are cheap to compute in software or hardware but have poor error-detection coverage. The blog post "Help, My Serial Data Has Been Framed: How To Handle Packets When All You Have Are Streams" illustrates how a checksum field fits into a typical framed packet design for serial streams.
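The two simplest forms mentioned above can be sketched in a few lines. This is a minimal illustration, not taken from any particular protocol; the function names `sum8` and `xor8` and the sample payload are assumptions made for the example.

```python
def sum8(data: bytes) -> int:
    """8-bit additive checksum: sum of all bytes modulo 256."""
    return sum(data) & 0xFF


def xor8(data: bytes) -> int:
    """8-bit XOR reduction: XOR of all bytes."""
    result = 0
    for byte in data:
        result ^= byte
    return result


# Hypothetical payload; the checksum byte is appended to form the frame body.
payload = bytes([0x01, 0x02, 0x03, 0xFF])
frame = payload + bytes([sum8(payload)])

print(hex(sum8(payload)))  # 0x5  (261 modulo 256)
print(hex(xor8(payload)))  # 0xff
```

Both functions need only a single accumulator, which is why they are so cheap, and also why so many different inputs map to the same value.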
The term "checksum" is often used loosely to mean any integrity value, including CRCs and cryptographic hashes. In strict usage, simple additive checksums (e.g., a byte sum or XOR) are the baseline form, whereas a CRC is a polynomial remainder computation with much stronger burst-error detection. Choosing the wrong algorithm for the application is a common source of field failures. The blog post "Bad Hash Functions and Other Stories: Trapped in a Cage of Irresponsibility and Garden Rakes" covers how weak integrity functions let real errors slip through undetected.
In firmware update and bootloader contexts, a checksum or CRC over the application image is typically stored in a known flash location and verified at startup before the image is executed. Getting the byte order and the range of bytes covered exactly right is critical and a frequent source of bugs; the blog post "Endianness and Serial Communication" is relevant when the checksum value itself must be transmitted or stored in a multi-byte field. Similarly, "The CRC Wild Goose Chase: PPP Does What?!?!?!" documents how subtle differences in polynomial, initial value, and bit-reflection settings can cause a computed value to never match even when the underlying algorithm seems correct.
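To make the parameter problem concrete, the sketch below runs one bitwise CRC-16 routine with the same polynomial (0x1021) under four commonly catalogued parameter sets. The routine and the `variants` table are written for this illustration; the variant labels follow common catalogue naming and are not taken from the blog posts cited above. The last lines show that even a matching value can be stored in two different byte orders.

```python
def reflect(value: int, width: int) -> int:
    """Reverse the order of the lowest `width` bits."""
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out


def crc16(data: bytes, poly: int, init: int, reflected: bool, xorout: int) -> int:
    """Bitwise CRC-16; `reflected` applies to both input bytes and the output."""
    crc = init
    for byte in data:
        if reflected:
            byte = reflect(byte, 8)
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    if reflected:
        crc = reflect(crc, 16)
    return crc ^ xorout


# name: (init, reflected, xorout); all share polynomial 0x1021
variants = {
    "CCITT-FALSE": (0xFFFF, False, 0x0000),
    "XMODEM": (0x0000, False, 0x0000),
    "KERMIT": (0x0000, True, 0x0000),
    "X-25": (0xFFFF, True, 0xFFFF),
}
data = b"123456789"
results = {
    name: crc16(data, 0x1021, init, refl, xorout)
    for name, (init, refl, xorout) in variants.items()
}
for name, value in results.items():
    print(f"{name:12s} 0x{value:04X}")

# The same 16-bit value serialises differently depending on byte order.
big = results["CCITT-FALSE"].to_bytes(2, "big")
little = results["CCITT-FALSE"].to_bytes(2, "little")
```

All four results differ even though the polynomial and the input are identical, so a receiver using the wrong initial value, reflection, or final XOR never sees a match.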
Frequently asked
What is the difference between a checksum and a CRC?
A checksum is typically a simple additive sum or XOR of data bytes. A CRC (cyclic redundancy check) is a polynomial long-division remainder. For standard CRC polynomials with a nonzero constant term, a degree-r CRC detects all single-bit errors and all burst errors of length up to r, along with many other error patterns that a simple sum or XOR will miss. For robust communication or storage, prefer a CRC over a plain checksum.
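The burst-error claim can be checked by brute force on a small case. The sketch below assumes an 8-bit CRC with polynomial 0x07, zero initial value, and no reflection; these parameters and the test message are choices made for the example. It applies every error burst of length 8 or less at every bit position and counts how many go undetected.

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8: remainder of the message polynomial times x^8, over GF(2)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


message = b"123456789"
check = crc8(message)

# With zero init and no final XOR, message + CRC divides evenly.
residue = crc8(message + bytes([check]))

# Try every nonzero 8-bit error window at every bit offset in the message.
n = int.from_bytes(message, "big")
nbits = len(message) * 8
burst_misses = 0
for shift in range(nbits - 7):
    for pattern in range(1, 256):
        corrupted = (n ^ (pattern << shift)).to_bytes(len(message), "big")
        if crc8(corrupted) == check:
            burst_misses += 1

print(hex(check), residue, burst_misses)
```

The miss count stays at zero because a burst no longer than the CRC width can never be a multiple of the generator polynomial. Longer or scattered error patterns carry no such guarantee.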
Which checksum algorithm should I use for a simple embedded protocol?
For low-overhead cases an 8-bit or 16-bit sum with modular reduction is common and easy to implement, but it cannot detect swapped bytes or many multi-byte errors. An XOR checksum is even weaker. If error detection quality matters, use CRC-8, CRC-16, or CRC-32 depending on data size and required coverage. Avoid inventing a custom algorithm; standard polynomials have known, documented properties.
Can a checksum detect all errors?
No. A simple sum or XOR checksum has blind spots: byte swaps within a word, certain pairs of bit flips, and any errors that happen to cancel out will produce the correct checksum on corrupted data. A CRC is stronger but still not infallible for all error patterns. Cryptographic MACs or hashes are needed when the integrity check must also resist intentional tampering.
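These blind spots are easy to reproduce. The following sketch uses made-up sample bytes and compares an 8-bit sum, an 8-bit XOR, and the CRC-32 from Python's standard `zlib` module on a byte swap and on a pair of cancelling bit flips.

```python
import zlib
from functools import reduce


def sum8(data: bytes) -> int:
    return sum(data) & 0xFF


def xor8(data: bytes) -> int:
    return reduce(lambda acc, b: acc ^ b, data, 0)


original = bytes([0x12, 0x34, 0x56, 0x78])
swapped = bytes([0x34, 0x12, 0x56, 0x78])      # first two bytes exchanged
double_flip = bytes([0x13, 0x35, 0x56, 0x78])  # bit 0 flipped in bytes 0 and 1

for name, corrupted in (("swapped", swapped), ("double_flip", double_flip)):
    print(
        name,
        "sum8 detects:", sum8(corrupted) != sum8(original),
        "xor8 detects:", xor8(corrupted) != xor8(original),
        "crc32 detects:", zlib.crc32(corrupted) != zlib.crc32(original),
    )
```

The swap slips past both simple checksums, and the double flip slips past the XOR, while CRC-32 catches both. That does not make the CRC infallible: it has only 2^32 possible values, so some corruptions of longer data must collide.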
Where should the checksum field be placed in a packet?
By convention, most standard protocols place the checksum at the end of the frame, so the sender can compute it while transmitting and the receiver can accumulate a running value as bytes arrive and then compare it against the final field. Some protocols place a length field and checksum at the start, which forces the sender to compute the checksum over the entire payload before transmitting the first byte. The choice affects both memory use and latency; placing it at the end is generally simpler for stream-oriented parsing.
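A byte-at-a-time parser for a trailing checksum might look like the sketch below. The frame layout (start byte 0x7E, one length byte, payload, 8-bit sum over length and payload) is hypothetical and chosen only for illustration; byte stuffing, timeouts, and resynchronisation after a bad frame are deliberately left out.

```python
SOF = 0x7E  # hypothetical start-of-frame marker


def build_frame(payload: bytes) -> bytes:
    """SOF, length, payload, then an 8-bit sum over length + payload."""
    if len(payload) > 255:
        raise ValueError("payload too long for a one-byte length field")
    body = bytes([len(payload)]) + payload
    return bytes([SOF]) + body + bytes([sum(body) & 0xFF])


class FrameParser:
    def __init__(self) -> None:
        self._reset()

    def _reset(self) -> None:
        self.state = "sof"
        self.remaining = 0
        self.running = 0
        self.payload = bytearray()

    def feed(self, byte: int):
        """Consume one byte; return the payload when a valid frame ends, else None."""
        if self.state == "sof":
            if byte == SOF:
                self.state = "len"
            return None
        if self.state == "len":
            self.remaining = byte
            self.running = byte  # the length byte is covered by the checksum
            self.state = "payload" if byte else "sum"
            return None
        if self.state == "payload":
            self.payload.append(byte)
            self.running = (self.running + byte) & 0xFF
            self.remaining -= 1
            if self.remaining == 0:
                self.state = "sum"
            return None
        # state == "sum": compare the running value with the received field
        result = bytes(self.payload) if byte == self.running else None
        self._reset()
        return result


def parse(stream: bytes) -> list:
    parser = FrameParser()
    return [p for p in map(parser.feed, stream) if p is not None]


good = build_frame(b"hi")
bad = bytearray(good)
bad[2] ^= 0x01  # corrupt the first payload byte
print(parse(good), parse(bytes(bad)))
```

Because the checksum arrives last, the parser keeps only one running byte of checksum state, and the comparison happens as soon as the final field is received.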
How do I verify a firmware image with a checksum at boot time?
Store the expected checksum (or CRC) at a fixed address in flash, typically in a header or trailer struct alongside the image length and version. At startup, recompute the value over the exact same byte range used when the image was built, then compare. Make sure the checksum field itself is either excluded from the computed range or included in it consistently, that endianness matches between the build tool and the target, and that no bytes are skipped due to alignment padding.
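The steps above can be modelled on the host side, for example in a build script or a unit test. The header layout here (length, version, CRC-32, all little-endian 32-bit fields placed ahead of the image) and the function names are assumptions for the sketch, not a standard format. The `<` prefix in the `struct` format string fixes the byte order and disables alignment padding, and the CRC covers only the image bytes, so the checksum field is excluded from its own range.

```python
import struct
import zlib

# Hypothetical header: image length, version, CRC-32 of the image bytes.
HEADER = struct.Struct("<III")


def build_image(app: bytes, version: int) -> bytes:
    """Build-tool side: prepend the header to the application bytes."""
    return HEADER.pack(len(app), version, zlib.crc32(app)) + app


def verify_image(flash: bytes) -> bool:
    """Boot side: recompute the CRC over exactly `length` bytes after the header."""
    if len(flash) < HEADER.size:
        return False
    length, _version, stored_crc = HEADER.unpack_from(flash, 0)
    start = HEADER.size
    if length > len(flash) - start:
        return False  # the length field points past the end of flash
    return zlib.crc32(flash[start:start + length]) == stored_crc


image = build_image(b"\x00\x01\x02application code", version=3)
flash = image + b"\xFF" * 16  # erased flash after the image is not covered

corrupted = bytearray(flash)
corrupted[HEADER.size + 4] ^= 0x20  # flip one bit inside the image

print(verify_image(flash), verify_image(bytes(corrupted)))
```

A C bootloader would follow the same sequence with a packed struct and the MCU's CRC routine; the point of the model is that the build tool and the target must agree on range, byte order, and CRC parameters.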
Differentiators vs similar concepts
Checksum is frequently conflated with CRC and with cryptographic hash. A checksum (sum or XOR) is the weakest of the three: fast and simple but poor at catching burst errors and unable to detect byte-order swaps. A CRC uses polynomial arithmetic to provide much stronger error detection; on embedded platforms the cost is often affordable, especially with hardware CRC peripherals or table-driven implementations, though a naive bitwise CRC can be heavier than a simple sum on small MCUs. A cryptographic hash (MD5, SHA-256, etc.) is far more expensive but provides collision resistance and is appropriate when the data source is untrusted. Checksum is also sometimes confused with "parity," which is essentially a 1-bit checksum applied per byte or per word rather than across an entire block.
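The parity comparison can be shown directly. This small sketch assumes even parity over one byte; the function name is made up for the example.

```python
def even_parity_bit(byte: int) -> int:
    """Bit that makes the total count of 1s (data plus parity) even."""
    return bin(byte & 0xFF).count("1") & 1


value = 0x07                # three 1 bits, so the parity bit is 1
one_flip = value ^ 0x10     # a single flipped bit changes the parity
two_flips = value ^ 0x18    # two flipped bits leave the parity unchanged

print(even_parity_bit(value), even_parity_bit(one_flip), even_parity_bit(two_flips))
# 1 0 1
```

Parity therefore catches any odd number of bit flips within the byte and misses any even number, the same cancellation weakness as a block-wide XOR.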