Later there will be, I hope, some people who will find it to their advantage to decipher all this mess.
— Évariste Galois, May 29, 1832
I was going to call this short series of articles “LFSRs for Dummies”, but thought better of it. What is a linear feedback shift register? If you want the short answer, the Wikipedia article is a decent introduction. But these articles are aimed at those of you who want a little bit deeper mathematical understanding, with some practical advice too.
Why study the LFSR? They have many practical applications in communications theory. You can use LFSRs to generate a pseudorandom bit sequence, or as a highspeed counter. But part of the reason to understand the theory behind LFSRs is just because the math has an inherent beauty.
(For a complete guide to the articles in this series, see the annotated Table of Contents.)
Linear Feedback Shift Registers: A Bitwise Approach
There are at least three ways to describe the linear feedback shift register, or LFSR: a practical bitwise approach, and two theoretical approaches based on the algebra of finite fields, one using polynomials and the other using matrices. We’ll start with the bitwise approach.
A shift register is a series of bit cells, each of which is a flipflop:
 it has an input signal
 it stores a single bit of state, either a 0 or a 1
 it changes that state bit to the input signal, upon receiving a clock signal
 it has an output signal that is the state of the shift register — if the shift register contains a 1, then it outputs a 1; if it contains a 0, then it outputs a 0
The output of one cell is connected to the input of the next, so that the bits in the shift register propagate from left to right, or right to left, depending on how the shift register is set up.
In the diagram above, the bits shift from right to left, and the cells are numbered from 0 to \( N1 \) (here we have a 5bit shiftregister, so \( N=5 \)). The state bits, collectively denoted \( S \), are individually denoted \( S_0 \) to \( S_{N1} \) from right to left. (There’s no strong reason to prefer righttoleft shifting rather than lefttoright shifting, but it does correspond more nicely with some of the theoretical math if we do so.) The input to cell 0 is \( u \), and the output of cell \( N1 \) is \( y \).
We can denote the history of the input at discrete instances in time \( 0, 1, 2, \ldots k \) by using brackets: \( S_j[k] \) represents the contents of bit \( j \) at timestep \( k \), and \( u[k] \) and \( y[k] \) represent the input and output at timestep \( k \).
If we just have a shift register, then it’s not very interesting:
 \( S_0[k] = u[k1] \) (cell #0 gets its input from the input signal \( u \))
 \( S_j[k] = S_{j1}[k1] \) for \( j>0 \) (each of the other cells gets its input from the previous cell)
 \( y[k] = S_{N1}[k] \)
This effectively produces \( S_j[k] = u[kj1] \) and \( y[k] = u[kN] \) because the input is delayed by \( j+1 \) and \( N \) timesteps, respectively.
So, for example, if the state \( S[0] \) = 11010
(treating \( S \) as an \( N \)bit number displayed with bit 0 as the rightmost bit), and part of the input waveform is \( u[0\ldots 5] = (0,1,0,0,1) \) then the next few steps would be
S[0] = 11010 u[0] = 0
S[1] = 10100 u[1] = 1
S[2] = 01001 u[2] = 0
S[3] = 10010 u[3] = 0
S[4] = 00100 u[4] = 1
S[5] = 01001
The \( u \) bits shift in from the right, and the \( y \) bits shift out from the left, so in the above example, the first 6 bits of \( y \) from \( y[0] \) to \( y[5] \) are \( (1,1,0,1,0,0) \). Got it? Again, not very interesting.
If you’re an electrical engineer, you can demonstrate this yourself with a 74HC595 shift register, some switches to control the input \( u \) and the clocking signal (after suitable debouncing), and some LEDs to display the state bits.
The interesting stuff comes when we use feedback to alter the input or the state. LFSRs do this in one of two ways.
Fibonacci LFSR
In a Fibonacci representation LFSR, the input \( u \) is set to some the state bits XOR’d together; its state equation is \( u[k] = \bigoplus\limits_{j=0}^{N1} b_jS_j[k] \) with coefficients \( b_j \) that are either 0 or 1, and here the symbol \( \bigoplus \) means to XOR all of its inputs together. So for example, if \( b_4 = b_2 = 1 \) and \( b_0 = b_1 = b_3 = 0 \), then the defining state equation is \( u[k] = S_4[k] \oplus S_2[k] = u[k5] \oplus u[k3] \) with the output having the same recurrence relation but delayed by 5 timesteps: \( y[k] = u[k5] = y[k5] \oplus y[k3] \). This LFSR is shown in the diagram below; the junction dot joining two input signals denotes an XOR operation.
In this example, suppose that the initial state is 00001
. Because there are taps on \( S_4 \) and \( S_2 \), the input \( u \) is set to \( S_4 \oplus S_2 \): 1 if either of those state bits are 1, 0 if neither or both are 1. Then the state changes over the first 10 timesteps are
k y S u
0 0 00001 0
1 0 00010 0
2 0 00100 1 (S2 = 1)
3 0 01001 0
4 1 10010 1 (S4 = 1)
5 0 00101 1 (S2 = 1)
6 0 01011 0
7 1 10110 0 (S2 = S4 = 1)
8 0 01100 1 (S2 = 1)
9 1 11001 1 (S4 = 1)
10 1 10011 1 (S4 = 1)
Galois LFSR
In a Galois representation LFSR, the input \( u \) is set to 0, but some of the state bits are altered by XOR’ing in the output:
 \( S_j[k] = S_{j1}[k1] \oplus a_jy[k1] \) for \( j > 0 \)
 \( S_0[k] = a_0y[k1] \)
 \( y[k] = S_{N1}[k] \)
For example, if \( a_0 = a_2 = 1 \) and the rest of the coefficients are zero, then we have \( S_0[k] = y[k1] \) and \( S_2[k] = S_1[k1] \oplus y[k1] \), with the rest of the bits propagating unchanged (\( S_4[k] = S_3[k1] \), \( S_3[k] = S_2[k1] \), and \( S_1[k] = S_0[k1] \)). This ends up producing the following bit patterns:
 \( S_0[k] = y[k1] \)
 \( S_1[k] = S_0[k1] = y[k2] \)
 \( S_2[k] = S_1[k1] \oplus y[k1] = y[k3] \oplus y[k1] \)
 \( S_3[k] = S_2[k1] = y[k4] \oplus y[k2] \)
 \( S_4[k] = S_3[k1] = y[k5] \oplus y[k3] \)
 \( y[k] = S_4[k] = y[k5] \oplus y[k3] \)
This LFSR is shown in the diagram below; the junction dot denotes a fanout (one input to multiple outputs), and the arrows into cells 0 and 2 denote that those cells XOR the bit \( y[k] \) into the cells’ normal inputs (and the input to cell 0 is 0), so \( S_2[k] = S_1[k1] \oplus y[k1] \) and \( S_0[k] = 0 \oplus y[k1] = y[k1] \).
In this example, suppose that the initial state is 00001
. Then the state changes over the first 10 timesteps are
k y S
0 0 00001
1 0 00010
2 0 00100
3 0 01000
4 1 10000
5 0 00101 (flip bits 2 and 0 since y[k1] = 1)
6 0 01010
7 1 10100
8 0 01101 (flip bits 2 and 0 since y[k1] = 1)
9 1 11010
10 1 10001
The order of the \( b_j \) and \( a_j \) are reversed in the two representations, that is, \( b_j = a_{N1j} \), but otherwise will produce the same output sequences with the same recurrence relation, although they have different internal state bits.
Fibonacci or Galois — Which is Better?
When people ask me which of the two to implement, I always recommend the Galois representation over the Fibonacci representation:

in a software implementation, it’s simpler than the Fibonacci LFSR, which requires parity calculation; here is one possible implementation of a Galois LFSR:
typedef struct { int32_t state; int32_t taps; int32_t ymask; } LFSR_T; LFSR_T lfsr; void lfsr_init(LFSR_T *plfsr) { plfsr>state = 1; plfsr>taps = 0x0005; // a0 = 1, a2 = 1 plfsr>ymask = 0x0010; // bit 4 is the highest significant state bit } /* Galois implementation of LFSR */ bool lfsr_step(LFSR_T *plfsr) { bool out = (plfsr>state & plfsr>ymask) != 0; plfsr>state <<= 1; // shift left if (out) { plfsr>state ^= plfsr>taps; /* flip bits connected to the taps if the output is 1. Note that the state bits beyond bit 4 are ignored */ } return out; }

in a hardware implementation, the Galois LFSR can operate at higher clock rates since it only uses 2input XOR gates, whereas the Fibonacci LFSR requires multiinput XOR gates that generally have a higher propagation delay

there is a more direct correspondence between the state bits in a Galois LFSR and the numbertheoretical representation of the LFSR
Visualizing the Galois LFSR
Below is an animation of a 5bit Galois LFSR with the coefficients 100101
:
The bottom row of the animation contains the state bits \( S[k] \); rows above contain the state bits at previous time instances \( S[k1], S[k2], \) and so on. The leftmost column of the animation contains the most recent sequence of output bits, from oldest at the top to the newest at the bottom.
Again, the behavior is very simple:
 at each step, shift the state bits left
 if we shifted out a 1, XOR all state bits with the coefficients (= flip state bits with coefficients 1).
Properties of the LFSR
Okay, what do we get out of this thing?
With properlychosen taps (and we’ll define more precisely what this means later), the LFSR is a maximallength LFSR, and its output sequence can be considered a pseudorandom bit sequence (PRBS) which has these properties:
 it repeats with a period \( 2^N1 \) where \( N \) is the number of bits in the LFSR (the allzero state never appears)
 all subsequences of length \( N \) appear, and are equally likely, except for all zeroes
 all subsequences of length \( 1, 2, 3, … N1 \) appear, and are almost equally likely (see below for an example)
 the correlation of the output sequence with a shifted version of itself is very low except at zero delay; this has a number of consequences in digital signal processing, one of which is that it is essentially a spreadspectrum pseudonoise sequence
For example, if \( N=8 \) then the period is \( 2^8 = 255 \), and the subsequence properties mean:
 all 8bit patterns
00000001
,00000010
,00000011
, etc. appear with frequency 1/255, except for00000000
which never appears  all 7bit patterns
0000001
,0000010
,0000011
, etc. appear with frequency 2/255 and the pattern0000000
appears with frequency 1/255  all 6bit patterns
000001
,000010
,000011
, etc. appear with frequency 4/255 and the pattern000000
appears with frequency 3/255  all 5bit patterns
00001
,00010
,00011
, etc. appear with frequency 8/255 and the pattern00000
appears with frequency 7/255  all 4bit patterns
0001
,0010
,0011
, etc. appear with frequency 16/255 and the pattern0000
appears with frequency 15/255  all 3bit patterns
001
,010
,011
, etc. appear with frequency 32/255 and the pattern000
appears with frequency 31/255  all 2bit patterns
01
,10
, and11
appear with frequency 64/255 and the pattern00
appears with frequency 63/255  the single bit
1
appears with frequency 128/255 and0
appears with frequency 127/255
The curious reader might ask: how do we know we can produce maximallength LFSRs, and how do we pick the “properlychosen” taps?
We’ll answer this question a little bit later in this article, and some more in Part II; first we have to understand some background. For this we have to turn to the theory of finite fields.
Groups, Rings, and Fields, Oh My!
Theoretical mathematicians like concepts that are unencumbered with reality, and don’t contain any “baggage” from all the realworld crap with its quirky exceptions — just pure abstract definitions. (“A monad is just a monoid in the category of endofunctors…”) Intuition based on practical experience is great for increasing understanding, but there’s the risk of bringing in some aspect of that practical experience which is inconsistent with the pure theory. Galileo’s experiments on falling bodies can be viewed as an example of this; acceleration under gravity is independent of mass, and can be verified by letting two stones of different masses fall, but if you use intuition based on rocks and feathers, you come up with a wrong conclusion because feathers have air resistance.
Along those lines, there are abstract algebraic structures called groups, rings, and fields. If you want to be rigorous about analyzing them, you have to disregard any analogous realworld systems… but as mere mortal amateur mathematicians, that’s also the only way we can build intuition about them.
So let’s start with a “times table”.
from IPython.core.display import HTML def binop_table(operator,operator_symbol,operand_set): def row(values,celltype='td'): return ('<'+celltype+'>' +('</%s><%s>' % (celltype,celltype)).join('%d' % v for v in values) +'</'+celltype+'>') return HTML('<table class="alignright"><tr><th>'+operator_symbol+'</th>' +row(operand_set,celltype='th') +'</tr><tr>' +'</tr>\n<tr>'.join('<th>%d</th>'%k +row(operator(j,k) for j in operand_set) for k in operand_set) +'</tr></table>') def times(a,b): return a*b binop_table(times,'×',xrange(10+1))
×  0  1  2  3  4  5  6  7  8  9  10 

0  0  0  0  0  0  0  0  0  0  0  0 
1  0  1  2  3  4  5  6  7  8  9  10 
2  0  2  4  6  8  10  12  14  16  18  20 
3  0  3  6  9  12  15  18  21  24  27  30 
4  0  4  8  12  16  20  24  28  32  36  40 
5  0  5  10  15  20  25  30  35  40  45  50 
6  0  6  12  18  24  30  36  42  48  54  60 
7  0  7  14  21  28  35  42  49  56  63  70 
8  0  8  16  24  32  40  48  56  64  72  80 
9  0  9  18  27  36  45  54  63  72  81  90 
10  0  10  20  30  40  50  60  70  80  90  100 
This is a diagram that shows an arithmetic operation × (multiplication) along with a set of operands (0,1,2,3,4,5,6,7,8,9,10). The arithmetic operation is a binary operator: it takes two inputs, \( a \) and \( b \), and computes some number \( c = a \times b \), where the result \( c \) is shown in the cells of the table.
We could do the same thing with + (addition) and the set of integers 0 through 11:
def plus(a,b): return a+b binop_table(plus,'+',xrange(12))
+  0  1  2  3  4  5  6  7  8  9  10  11 

0  0  1  2  3  4  5  6  7  8  9  10  11 
1  1  2  3  4  5  6  7  8  9  10  11  12 
2  2  3  4  5  6  7  8  9  10  11  12  13 
3  3  4  5  6  7  8  9  10  11  12  13  14 
4  4  5  6  7  8  9  10  11  12  13  14  15 
5  5  6  7  8  9  10  11  12  13  14  15  16 
6  6  7  8  9  10  11  12  13  14  15  16  17 
7  7  8  9  10  11  12  13  14  15  16  17  18 
8  8  9  10  11  12  13  14  15  16  17  18  19 
9  9  10  11  12  13  14  15  16  17  18  19  20 
10  10  11  12  13  14  15  16  17  18  19  20  21 
11  11  12  13  14  15  16  17  18  19  20  21  22 
Both of these examples share some similar properties:
 + and × are commutative: the order of the operands doesn’t matter, so \( a+b = b+a \) and \( a\times b = b\times a \).
 + and × have an identity element: \( a+0=0+a=a \) and \( a\times 1 = 1 \times a = a \) for any \( a \).
 The set of operands in each case is not closed under multiplication or addition: it produces a result which can be outside the original set of operands.
Furthermore, although it’s not directly visible in the tables above,
 + and × are associative: if I repeat the operation twice, the order doesn’t matter: \( (a+b)+c = a+(b+c) \) and \( (a\times b)\times c = a\times (b\times c) \).
Now let’s make one minor tweak, and show an addition table modulo 12; we’re going to wrap around so that the result is always between 0 and 11. If the result would normally be 12 or greater, we subtract a multiple of 12 so that it’s between 0 and 11. For example, \( 6+8 \bmod 12 = 14 \bmod 12 = 2 \).
def plus_mod_12(a,b): return (a+b)%12 binop_table(plus_mod_12, '+', xrange(12))
+  0  1  2  3  4  5  6  7  8  9  10  11 

0  0  1  2  3  4  5  6  7  8  9  10  11 
1  1  2  3  4  5  6  7  8  9  10  11  0 
2  2  3  4  5  6  7  8  9  10  11  0  1 
3  3  4  5  6  7  8  9  10  11  0  1  2 
4  4  5  6  7  8  9  10  11  0  1  2  3 
5  5  6  7  8  9  10  11  0  1  2  3  4 
6  6  7  8  9  10  11  0  1  2  3  4  5 
7  7  8  9  10  11  0  1  2  3  4  5  6 
8  8  9  10  11  0  1  2  3  4  5  6  7 
9  9  10  11  0  1  2  3  4  5  6  7  8 
10  10  11  0  1  2  3  4  5  6  7  8  9 
11  11  0  1  2  3  4  5  6  7  8  9  10 
Now we have something interesting:
 the set \( G = {0,1,2,3,4,5,6,7,8,9,10,11} \) is closed under addition modulo 12: we end up with results that are in the same set \( G \)
 each element in the set \( G \) has an inverse: if we add them modulo 12, we get the identity element 0. For example, \( 1+11=0 \), \( 2+10=0 \), and so on.
In this case, the operator \( + \) (mod 12) and the set \( G={0,1,…,11} \) form a group, and our binop_table
function displays something called a Cayley table. Groups in general are defined as a binary operator \( \cdot \) and a set \( G \) which have the following properties or axioms:
 closure: \( a \cdot b=c \) for \( a,b \) in \( G \) will always produce \( c \) in \( G \) (mathematical notation uses \( \in \) which just means “in” or “is an element of”, so closure can be expressed as \( a,b \in G \implies a \cdot b = c \in G \))
 associativity: \( (a \cdot b) \cdot c = a \cdot (b \cdot c) \)
 identity: there is an element \( e \) so that \( e \cdot a = a \cdot e = a \) for all \( a \in G \)
 inverse: for all \( a \in G \), there is a unique corresponding inverse element \( a^{1} \) such that \( a \cdot a^{1} = a^{1} \cdot a = e \)
There are also abelian groups which are commutative:
 commutativity: \( a \cdot b \) = \( b \cdot a \)
All other groups are noncommutative or nonabelian. (Matrix multiplication is noncommutative, for example, as is the group denoting all permutations of a Rubik’s Cube.)
Some groups are finite (the set \( G \) is finite), and some are infinite. The \( + \) mod 12 example is a finite group, whereas the group of integers under addition is an infinite group.
So why do we care?
Well, groups have some particular properties that are universal, so if you know how to handle some task in one case, the same methodology applies in another. Certain theorems and terminologies apply to any group, like the concept of order: the order of a group \( G \) means the number of elements in it; the order of an element \( a \) in a group means the number of times you need to apply the operation to \( a \) to produce the identity element. For example, the element \( 4 \) in the group \( + \) mod 12 has order 3, since \( 4+4+4 = 0 \).
Furthermore, there are examples of groups that are isomorphic, where there is a onetoone correspondence between the elements, so if \( a_1 \oplus b_1 = c_1 \) in some group \( G_1 \) defined for the operation \( \oplus \), and another group \( G_2 \) defined for the operation \( \otimes \) has corresponding elements \( a_2 \sim a_1 \), \( b_2 \sim b_1 \), and \( c_2 \sim c_1 \) such that \( a_2 \otimes b_2 = c_2 \). Ouch, sometimes the formality hurts my head.
We’ll show an example of isomorphism when we get to the LFSR. Also below is another example of a group that is isomorphic to addition modulo 12. Here we take modulo 13 and use the set of numbers from 1 to 12. (We can’t include 0, since 0 doesn’t have an inverse in multiplication.) This group is called \( \mathbb{Z} _ {13}{}^ * \): the symbol \( \mathbb{Z} \) is used to denote the set of all integers, \( \mathbb{Z}_{13} \) is the set from 0 to 12, and \( \mathbb{Z} _ {13}{}^ * \) just takes out 0.
def times_mod_13(a,b): return (a*b) % 13 binop_table(times_mod_13,'×',xrange(1,13))
×  1  2  3  4  5  6  7  8  9  10  11  12 

1  1  2  3  4  5  6  7  8  9  10  11  12 
2  2  4  6  8  10  12  1  3  5  7  9  11 
3  3  6  9  12  2  5  8  11  1  4  7  10 
4  4  8  12  3  7  11  2  6  10  1  5  9 
5  5  10  2  7  12  4  9  1  6  11  3  8 
6  6  12  5  11  4  10  3  9  2  8  1  7 
7  7  1  8  2  9  3  10  4  11  5  12  6 
8  8  3  11  6  1  9  4  12  7  2  10  5 
9  9  5  1  10  6  2  11  7  3  12  8  4 
10  10  7  4  1  11  8  5  2  12  9  6  3 
11  11  9  7  5  3  1  12  10  8  6  4  2 
12  12  11  10  9  8  7  6  5  4  3  2  1 
This one is interesting in a few ways:
 determining the inverse is a little more nontrivial; there are no obvious patterns here: \( 1\times 1 = 1,\, 2 \times 7 = 1,\, 3 \times 9 = 1,\, 4 \times 10 = 1,\, 5 \times 8 = 1,\, \ldots \)

the order of some elements is 12; we can form cycles by repeated multiplication, using each element as a generator of a subgroup (any group can also be considered a subgroup of itself):
 \( 3 \times 3 = 9, 9 \times 3 = 1 \), yielding the repeating sequence \( (1,3,9) \). This is a subgroup: multiplying elements in the set \( (1,3,9) \) modulo 13 will always produce another element in the same set.
 in the same way, the element \( 2 \) is a generator of the entire group, forming the repeating sequence \( (1,2,4,8,3,6,12,11,9,5,10,7) \)
 \( 1 \) is a generator of the sequence \( (1) \), another subgroup (with only one element)
 \( 4 \) is a generator of the sequence \( (1,4,3,12,9,10) \), another subgroup.
 \( 5 \) is a generator of the sequence \( (1,5,12,8) \), again, another subgroup.
 \( 6 \) is a generator of the sequence \( (1,6,10,8,9,2,12,7,3,5,4,11) \) forming the entire group.
 \( 7 \) is a generator of the sequence \( (1,7,10,5,9,11,12,6,3,8,4,2) \) forming the entire group.
 \( 8 \) is a generator of the sequence \( (1,8,12,5) \), again, another subgroup.
 \( 9 \) is a generator of \( (1,9,3) \), another subgroup
 \( 10 \) is a generator of the sequence \( (1,10,9,12,3,4) \), again, another subgroup.
 \( 11 \) is a generator of the sequence \( (1,11,4,5,3,7,12,2,9,8,10,6) \) forming the entire group.
 \( 12 \) is a generator of the sequence \( (1,12) \), again, another subgroup.
So this group \( \mathbb{Z}_{13}{}^* \) has one element (1) of order 1, one element (12) of order 2, two elements (3,9) of order 3, two elements (5,8) of order 4, two elements (4,10) of order 6, and four elements (2,6,7,11) of order 12. (Lagrange’s theorem says that the order of an element is always a divisor of the group order, 12 in this case.) Because some elements are generators of the group itself, forming a complete cycle of all elements, this is called a cyclic group.
Multiplication modulo any prime number \( p \) forms a group \( \mathbb{Z} _ p{}^ * \) with similar properties, forming cyclic subgroups generated by each of its elements, and the number of generators of the group \( \mathbb{Z} _ p{}^ * \) itself, having the full order \( p1 \), is always \( \phi(p1) \) where \( \phi(n) \) is Euler’s totient function. This can be computed by taking the product over each of its prime divisors p: \( \phi(n) = n\prod\limits _ {pn} \frac{p1}{p} \). For \( \phi(12) \) this is \( 12 \cdot \frac{1}{2} \cdot \frac{2}{3} = 4 \).
But how is this group isomorphic to the addition table modulo 12? It doesn’t look anything like the addition table, which has those nice symmetric diagonals.
The secret is in how we arrange the elements; we’ll just pick a generator \( g \) like 2, and list the elements in order of the sequence \( g^k \):
binop_table(times_mod_13,'×',(1,2,4,8,3,6,12,11,9,5,10,7))
×  1  2  4  8  3  6  12  11  9  5  10  7 

1  1  2  4  8  3  6  12  11  9  5  10  7 
2  2  4  8  3  6  12  11  9  5  10  7  1 
4  4  8  3  6  12  11  9  5  10  7  1  2 
8  8  3  6  12  11  9  5  10  7  1  2  4 
3  3  6  12  11  9  5  10  7  1  2  4  8 
6  6  12  11  9  5  10  7  1  2  4  8  3 
12  12  11  9  5  10  7  1  2  4  8  3  6 
11  11  9  5  10  7  1  2  4  8  3  6  12 
9  9  5  10  7  1  2  4  8  3  6  12  11 
5  5  10  7  1  2  4  8  3  6  12  11  9 
10  10  7  1  2  4  8  3  6  12  11  9  5 
7  7  1  2  4  8  3  6  12  11  9  5  10 
Tada! And here we see some more patterns if we express each element as a power of the generator \( g=2 \):
 \( 1 = g^0 \) has order 1
 \( 2 = g^1 \) has order \( 12/\gcd(12,1) = 12/1 = 12 \)
 \( 4 = g^2 \) has order \( 12/\gcd(12,2) = 12/2 = 6 \)
 \( 8 = g^3 \) has order \( 12/\gcd(12,3) = 12/3 = 4 \)
 \( 3 = g^4 \) has order \( 12/\gcd(12,4) = 12/4 = 3 \)
 \( 6 = g^5 \) has order \( 12/\gcd(12,5) = 12/1 = 12 \)
 \( 12 = g^6 \) has order \( 12/\gcd(12,6) = 12/6 = 2 \)
 \( 11 = g^7 \) has order \( 12/\gcd(12,7) = 12/1 = 12 \)
 \( 9 = g^8 \) has order \( 12/\gcd(12,8) = 12/4 = 3 \)
 \( 5 = g^9 \) has order \( 12/\gcd(12,9) = 12/3 = 4 \)
 \( 10 = g^{10} \) has order \( 12/\gcd(12,10) = 12/2 = 6 \)
 \( 7 = g^{11} \) has order \( 12/\gcd(12,11) = 12/1 = 12 \)
It also doesn’t matter which generator we use; the group \( \mathbb{Z}_{p}{}^* \) is isomorphic to itself and we can replace each element \( g^k \) with another element \( g^{mk} \) where \( m \) is relatively prime to the group order \( p1 \); we’ll get the same pattern:
binop_table(times_mod_13,'×',(1,6,10,8,9,2,12,7,3,5,4,11))
×  1  6  10  8  9  2  12  7  3  5  4  11 

1  1  6  10  8  9  2  12  7  3  5  4  11 
6  6  10  8  9  2  12  7  3  5  4  11  1 
10  10  8  9  2  12  7  3  5  4  11  1  6 
8  8  9  2  12  7  3  5  4  11  1  6  10 
9  9  2  12  7  3  5  4  11  1  6  10  8 
2  2  12  7  3  5  4  11  1  6  10  8  9 
12  12  7  3  5  4  11  1  6  10  8  9  2 
7  7  3  5  4  11  1  6  10  8  9  2  12 
3  3  5  4  11  1  6  10  8  9  2  12  7 
5  5  4  11  1  6  10  8  9  2  12  7  3 
4  4  11  1  6  10  8  9  2  12  7  3  5 
11  11  1  6  10  8  9  2  12  7  3  5  4 
Fermat’s little theorem that \( a^p \equiv a \pmod p \) is an immediate consequence of the structure of the cyclic group \( \mathbb{Z}_p{}^* \): you can take any element \( a \) of the group and it will always form cycles that divide the group order \( p1 \), so \( a = a^1 \equiv a^{1 + k(p1)} \pmod p \) for any \( k \), and Fermat’s little theorem is just the case where \( k=1 \).
This ordering of group elements, expressing them as successive powers of a generator, is very powerful, and it’s something that we’ll reuse later.
Okay, enough about groups, for now.
A ring is a group with BONUS POWERS (well, with the additional properties listed below, at least):
 it’s abelian (commutative) over the group’s binary operation \( \oplus \)
 it has a second binary operation \( \otimes \) which:
 forms a closed set over the ring’s elements
 is associative, so \( (a \otimes b) \otimes c = a \otimes (b \otimes c) \)
 is distributive over \( \oplus \):
 \( a \otimes (b \oplus c) = (a \otimes b) \oplus (a \otimes c) \)
 \( (b \oplus c) \otimes a = (b \otimes a) \oplus (c \otimes a) \)
 has an identity element (which in general is a different element from the identity element of \( \oplus \))
A field is a ring where both binary operations are commutative and have inverses; every element has an inverse with respect to \( \oplus \) and all elements except for the “zero” element (which is the identity element with respect to \( \oplus \)) has an inverse with respect to \( \otimes \).
Those are some rather abstract definitions and I won’t say much more than that until we get into some more specific examples. Think of a ring as having addition, subtraction, and multiplication welldefined; fields add division. “Ordinary arithmetic” with rational or real or complex numbers (\( \mathbb{Q}, \mathbb{R} \), and \( \mathbb{C} \), respectively) forms an infinite field. Integer arithmetic modulo \( p \) (where \( p \) is prime) forms a finite field.
There’s also something called an ideal and something else called a quotient ring and these just get too abstract for me to understand very well, so I have no hope in explaining their significance to anyone else, except in one very specific case that we’ll discuss in a moment.
I Am Mad at Évariste Galois
I am mad at Évariste Galois, a brilliant young mathematician who lived in postNapoleonic France and who was the originator of several major concepts in what later became known as group theory.
In addition to his mathematical activities, Galois became involved in radical political demonstrations. He got himself arrested at the age of 19, and was sentenced to six months in prison two days before his 20th birthday. A month after he was released, he participated in a duel, either for political reasons or for a romantic entanglement, and was shot and killed.
Niels Henrik Abel, a Norwegian mathematical prodigy and a contemporary of Galois, also tragically died young, at the age of 26, but at least his death can be blamed on tuberculosis rather than on highrisk activities.
Think of all the potential accomplishments that a 20yearold mathematician could have accomplished in the rest of his career. (Euler lived to be 76, Gauss and Lagrange lived to be 77, and Legendre lived to be 80.)
Galois Fields
Finite fields are named Galois fields in honor of Évariste Galois. Remember, a field has a set of elements and has two binary operations \( \oplus \) and \( \otimes \), which are abstract analogues to addition and multiplication; it also has analogues to subtraction and division using additive and multiplicative inverses. The simplest Galois field is known as GF(2) and it consists of two elements: 0 and 1. The binary operations are addition and multiplication, both done modulo 2:
# monkeypatch HTML to overload concatenation HTML.__add__ = lambda h1,h2: HTML(h1.data+h2.data) (binop_table((lambda a,b: (a+b)&1), '+', (0,1)) +binop_table((lambda a,b: a*b), '×', (0,1)))
+  0  1 

0  0  1 
1  1  0 
×  0  1 

0  0  0 
1  0  1 
That’s it. The additive identity element is 0, the multiplicative identity element is 1. If you interpret GF(2) in a bitwise manner, addition is like XOR, and multiplication is like AND.
Not very interesting.
The interesting part comes when you use GF(2) to create larger algebraic structures, like the polynomial ring \( GF(2)[x] \), which is essentially the set of polynomials in \( x \) with only the elements of GF(2) allowed as coefficients (so \( x^3 + x + 1 \) is such a polynomial, but \( x^3  x + 1 \) and \( x^3 +2x + 1 \) are not), and the quotient ring \( GF(2)[x]/p \) where \( p \) is a polynomial with coefficients 0 or 1. Here we are starting to get a bit abstract — I still have trouble with quotient rings — and an example will help.
Let’s use a specific \( p(x) = x^4 + x + 1 \), and just for ease of naming, let’s call the quotient ring \( GF(2)[x]/p = H_{13} \) for reasons that will become apparent later. There are 16 elements of \( H_{13} \), and they are the polynomials of degree 3 or less, with coefficients 0 or 1:
$$ \begin{array}{rrrr} 0 & 1 & x & x+1 \cr x^2 & x^2 + 1 & x^2 + x & x^2 + x + 1 \cr x^3 & x^3 + 1 & x^3 + x & x^3 + x + 1 \cr x^3 + x^2 & x^3 + x^2 + 1 & x^3 + x^2 + x & x^3 + x^2 + x + 1 \end{array}$$
\( H_{13} \) happens to be a finite field (which many mathematicians would just call \( GF(2^4) \) because the two are isomorphic, but never mind that) with addition and multiplication as the two operations. The way you add and multiply elements in \( H_{13} \) is the same way you add and multiply elements in any such quotient ring \( GF(2)[x]/p \), and here are the rules:
 adding elements: just add the coefficients mod 2; for example, if \( a = x^3 + x \) and \( b = x^2+x \), then \( a+b = (x^3 + x) + (x^2 + x) = x^3 + x^2 \).
 multiplying elements:
 multiply the polynomials
 take the remainder modulo \( p(x) \) with coefficients mod 2
The multiplying step is where things get really interesting. Essentially we can subtract any multiple of \( p(x) \) needed to make the result have a degree that is less than the degree of \( p(x) \); in the case of \( H_{13} \), the polynomial \( p(x) \) is of degree 4, so the elements of \( H_{13} \) are all degree 3 or less.
As a concrete example, let’s take \( a = x^3 + x \) and \( b = x^2+x \) again. If we want to calculate \( ab \),
$$\begin{align} ab &= (x^3 + x)(x^2+x) \cr &= x^5 + x^4 + x^3 + x^2 \cr &= x^5 + x^4 + x^3 + x^2  xp(x) \cr &= (x^5 + x^4 + x^3 + x^2)  (x^5 + x^2 + x) \cr &= x^4 + x^3 + x \cr &= x^4 + x^3 + x  p(x)\cr &= (x^4 + x^3 + x)  (x^4 + x + 1) \cr &= x^3 + 1 \end{align}$$
Let’s try to use \( x \) as a generator and see where it gets us:
$$\begin{align} x^0 &= 1 \cr x^1 &= x \cr x^2 &= x^2 \cr x^3 &= x^3 \cr x^4 &= x^4  p(x) \cr &= x + 1 \cr x^5 &= x(x^4) \cr &= x^2 + x \cr x^6 &= x^3 + x^2 \cr x^7 &= x^4 + x^3  p(x) \cr &= x^3 + x + 1 \cr x^8 &= x^4 + x^2 + x  p(x) \cr &= x^2 + 1 \cr x^9 &= x^3 + x \cr x^{10} &= x^4 + x^2  p(x) \cr &= x^2 + x + 1 \cr x^{11} &= x^3 + x^2 + x \cr x^{12} &= x^4 + x^3 + x^2  p(x)\cr &= x^3 + x^2 + x + 1 \cr x^{13} &= x^4 + x^3 + x^2 + x  p(x) \cr &= x^3 + x^2 + 1 \cr x^{14} &= x^4 + x^3 + x  p(x) \cr &= x^3 + 1 \cr x^{15} &= x^4 + x  p(x) \cr &= 1 = x^0 \end{align}$$
Yep! Since we have a cycle of period 15, we’ve hit all of the elements of \( H_{13} \) except for \( 0 \), which means that \( x \) is a generator of the multiplicative group of \( H_{13} \). Also this means we can express all of the nonzero elements as powers of \( x \): for example, \( a=x^3 + x = x^9 \) and \( b = x^2 + x = x^5 \) and by inspection we can tell that the product is \( ab = x^{9+5} = x^{14} = x^3 + 1 \).
This doesn’t always work; consider the multiplicative group of \( GF(2)[x]/(x^4+1) \):
$$\begin{align} x^0 &= 1 \cr x^1 &= x \cr x^2 &= x^2 \cr x^3 &= x^3 \cr x^4 &= x^4  p(x) \cr &= 1 = x^0\cr \end{align}$$
So \( x \) is not a generator of this group; it has a cycle of length 4 and leaves out other nonzero elements like \( x^2 + x \) or \( x^3 + 1 \).
Polynomials \( p(x) \) such that \( x \) is a generator of the multiplicative group of \( GF(2)[x]/p(x) \) are called primitive polynomials. And all that really means is
 you start with \( 1 \)
 each step, you multiply by \( x \)
 when you get a term \( x^N \) (where \( N \) is the degree of \( p(x) \)), then you subtract off \( p(x) \), remembering to take coefficients modulo 2
 if the next time you get a \( 1 \) is at \( x^m \) with \( m = 2^N  1 \), then you’ve generated all the nonzero elements of the field and \( p(x) \) is a primitive polynomial.
Isomorphism between Nbit LFSRs and \( GF(2^N) \)
Okay, now we’re going to come back to the LFSR. It turns out that there is a very neat isomorphism between the bitwise approach for describing a Galois LFSR, and the quotient ring \( GF(2)[x]/p(x) \). Isomorphism is Greek for “equal form” and just means that for these two concepts, there are mappings back and forth between each aspect, and we can apply insights from one of them to the other.
Ready?
Galois LFSR  \( GF(2)[x]/p(x) \) 

LFSR state  nonzero element 
period  order of the multiplicative group 
state update: shift left, XOR with coefficients if necessary  multiplication by \( x \), subtract \( p(x) \) if necessary 
state \( S[k] \) with \( S[0] = \) `0000...001`  element \( x^k \) 
coefficients \( (a_{N1},\ldots,a_2,a_1,a_0) \)  polynomial \( p(x) = a_{N1}x^{N1} + \ldots + a_2x^2 + a_1x + a_0 \) 
Perhaps this is too abstract. Let’s consider the specific example \( H_{13} = GF(2)[x]/(x^4 + x + 1) \) and its corresponding Galois LFSR with binary coefficients 10011 = 0x13
(which is why we’re calling it \( H_{13} \)!). The LFSR has a period of 15, which is the order of the multiplicative group of \( H_{13} \).
Now let’s track the various states on each step, along with the corresponding elements of \( H_{13} \), and what you’ll notice is that each bit in the LFSR state is analogous to the corresponding term of the field element (so, for example, bit 3 of the LFSR at timestep k
corresponds to the \( x^3 \) coefficient in the element \( x^k \)):
def bits_as_polynomial(bits, show_zeros=False, nmin=0, phantom=False): terms = [] k = 0 while bits > 0 or k < nmin: monomial = 'x^{%d}' % k if k >= 2 else ('x' if k == 1 else '1') if bits & 1: terms.append(monomial) elif show_zeros: terms.append('0' if not phantom else '\\phantom{'+monomial+'}\\llap{0}') k += 1 bits >>= 1 return ' + '.join(terms[::1]) def bits_as_binary(bits, nmin=0): result = '' while bits > 0 or nmin > 0: result = ('1' if bits & 1 else '0') + result bits >>= 1 nmin = 1 return result def show_LFSR_cycle(polybits, nmax=float('inf')): polybits_binstr = bits_as_binary(polybits) nbits = len(polybits_binstr)1 html = ('<table class="alignright"><tr><th>k</th><th>LFSR<br/> coefficients <code>' +polybits_binstr+('</code></th><th>\\(H_{%x}=GF(2)[x]/p(x),\\)<br />\\(p(x)=%s\\)</th></tr>' % (polybits, bits_as_polynomial(polybits))) ) state = 1 k = 0 while k < nmax: html += ('<tr><td>%d</td>' % k) html += ('<td>\\(S[%d] = \\)<code>' % k)+bits_as_binary(state, nbits)+'</code></td>' html += ('<td>\\(x^{%d} = '%k)+bits_as_polynomial(state,True,nmin=nbits,phantom=True)+'\\)</td></tr>' if state == 1 and k > 0: break # now update state state <<= 1 if (state >> nbits) > 0: state ^= polybits k += 1 html += '</table>' #return bits_as_polynomial(polybits) return HTML(html) show_LFSR_cycle(0x13, 50)
k  LFSR coefficients 10011  \(H_{13}=GF(2)[x]/p(x),\) \(p(x)=x^{4} + x + 1\) 

0  \(S[0] = \)0001  \(x^{0} = \phantom{x^{3}}\llap{0} + \phantom{x^{2}}\llap{0} + \phantom{x}\llap{0} + 1\) 
1  \(S[1] = \)0010  \(x^{1} = \phantom{x^{3}}\llap{0} + \phantom{x^{2}}\llap{0} + x + \phantom{1}\llap{0}\) 
2  \(S[2] = \)0100  \(x^{2} = \phantom{x^{3}}\llap{0} + x^{2} + \phantom{x}\llap{0} + \phantom{1}\llap{0}\) 
3  \(S[3] = \)1000  \(x^{3} = x^{3} + \phantom{x^{2}}\llap{0} + \phantom{x}\llap{0} + \phantom{1}\llap{0}\) 
4  \(S[4] = \)0011  \(x^{4} = \phantom{x^{3}}\llap{0} + \phantom{x^{2}}\llap{0} + x + 1\) 
5  \(S[5] = \)0110  \(x^{5} = \phantom{x^{3}}\llap{0} + x^{2} + x + \phantom{1}\llap{0}\) 
6  \(S[6] = \)1100  \(x^{6} = x^{3} + x^{2} + \phantom{x}\llap{0} + \phantom{1}\llap{0}\) 
7  \(S[7] = \)1011  \(x^{7} = x^{3} + \phantom{x^{2}}\llap{0} + x + 1\) 
8  \(S[8] = \)0101  \(x^{8} = \phantom{x^{3}}\llap{0} + x^{2} + \phantom{x}\llap{0} + 1\) 
9  \(S[9] = \)1010  \(x^{9} = x^{3} + \phantom{x^{2}}\llap{0} + x + \phantom{1}\llap{0}\) 
10  \(S[10] = \)0111  \(x^{10} = \phantom{x^{3}}\llap{0} + x^{2} + x + 1\) 
11  \(S[11] = \)1110  \(x^{11} = x^{3} + x^{2} + x + \phantom{1}\llap{0}\) 
12  \(S[12] = \)1111  \(x^{12} = x^{3} + x^{2} + x + 1\) 
13  \(S[13] = \)1101  \(x^{13} = x^{3} + x^{2} + \phantom{x}\llap{0} + 1\) 
14  \(S[14] = \)1001  \(x^{14} = x^{3} + \phantom{x^{2}}\llap{0} + \phantom{x}\llap{0} + 1\) 
15  \(S[15] = \)0001  \(x^{15} = \phantom{x^{3}}\llap{0} + \phantom{x^{2}}\llap{0} + \phantom{x}\llap{0} + 1\) 
Or \( H_{25} \) which is isomorphic to the 5cell LFSR we showed earlier:
show_LFSR_cycle(0x25,50)
k  LFSR coefficients 100101  \(H_{25}=GF(2)[x]/p(x),\) \(p(x)=x^{5} + x^{2} + 1\) 

0  \(S[0] = \)00001  \(x^{0} = \phantom{x^{4}}\llap{0} + \phantom{x^{3}}\llap{0} + \phantom{x^{2}}\llap{0} + \phantom{x}\llap{0} + 1\) 
1  \(S[1] = \)00010  \(x^{1} = \phantom{x^{4}}\llap{0} + \phantom{x^{3}}\llap{0} + \phantom{x^{2}}\llap{0} + x + \phantom{1}\llap{0}\) 
2  \(S[2] = \)00100  \(x^{2} = \phantom{x^{4}}\llap{0} + \phantom{x^{3}}\llap{0} + x^{2} + \phantom{x}\llap{0} + \phantom{1}\llap{0}\) 
3  \(S[3] = \)01000  \(x^{3} = \phantom{x^{4}}\llap{0} + x^{3} + \phantom{x^{2}}\llap{0} + \phantom{x}\llap{0} + \phantom{1}\llap{0}\) 
4  \(S[4] = \)10000  \(x^{4} = x^{4} + \phantom{x^{3}}\llap{0} + \phantom{x^{2}}\llap{0} + \phantom{x}\llap{0} + \phantom{1}\llap{0}\) 
5  \(S[5] = \)00101  \(x^{5} = \phantom{x^{4}}\llap{0} + \phantom{x^{3}}\llap{0} + x^{2} + \phantom{x}\llap{0} + 1\) 
6  \(S[6] = \)01010  \(x^{6} = \phantom{x^{4}}\llap{0} + x^{3} + \phantom{x^{2}}\llap{0} + x + \phantom{1}\llap{0}\) 
7  \(S[7] = \)10100  \(x^{7} = x^{4} + \phantom{x^{3}}\llap{0} + x^{2} + \phantom{x}\llap{0} + \phantom{1}\llap{0}\) 
8  \(S[8] = \)01101  \(x^{8} = \phantom{x^{4}}\llap{0} + x^{3} + x^{2} + \phantom{x}\llap{0} + 1\) 
9  \(S[9] = \)11010  \(x^{9} = x^{4} + x^{3} + \phantom{x^{2}}\llap{0} + x + \phantom{1}\llap{0}\) 
10  \(S[10] = \)10001  \(x^{10} = x^{4} + \phantom{x^{3}}\llap{0} + \phantom{x^{2}}\llap{0} + \phantom{x}\llap{0} + 1\) 
11  \(S[11] = \)00111  \(x^{11} = \phantom{x^{4}}\llap{0} + \phantom{x^{3}}\llap{0} + x^{2} + x + 1\) 
12  \(S[12] = \)01110  \(x^{12} = \phantom{x^{4}}\llap{0} + x^{3} + x^{2} + x + \phantom{1}\llap{0}\) 
13  \(S[13] = \)11100  \(x^{13} = x^{4} + x^{3} + x^{2} + \phantom{x}\llap{0} + \phantom{1}\llap{0}\) 
14  \(S[14] = \)11101  \(x^{14} = x^{4} + x^{3} + x^{2} + \phantom{x}\llap{0} + 1\) 
15  \(S[15] = \)11111  \(x^{15} = x^{4} + x^{3} + x^{2} + x + 1\) 
16  \(S[16] = \)11011  \(x^{16} = x^{4} + x^{3} + \phantom{x^{2}}\llap{0} + x + 1\) 
17  \(S[17] = \)10011  \(x^{17} = x^{4} + \phantom{x^{3}}\llap{0} + \phantom{x^{2}}\llap{0} + x + 1\) 
18  \(S[18] = \)00011  \(x^{18} = \phantom{x^{4}}\llap{0} + \phantom{x^{3}}\llap{0} + \phantom{x^{2}}\llap{0} + x + 1\) 
19  \(S[19] = \)00110  \(x^{19} = \phantom{x^{4}}\llap{0} + \phantom{x^{3}}\llap{0} + x^{2} + x + \phantom{1}\llap{0}\) 
20  \(S[20] = \)01100  \(x^{20} = \phantom{x^{4}}\llap{0} + x^{3} + x^{2} + \phantom{x}\llap{0} + \phantom{1}\llap{0}\) 
21  \(S[21] = \)11000  \(x^{21} = x^{4} + x^{3} + \phantom{x^{2}}\llap{0} + \phantom{x}\llap{0} + \phantom{1}\llap{0}\) 
22  \(S[22] = \)10101  \(x^{22} = x^{4} + \phantom{x^{3}}\llap{0} + x^{2} + \phantom{x}\llap{0} + 1\) 
23  \(S[23] = \)01111  \(x^{23} = \phantom{x^{4}}\llap{0} + x^{3} + x^{2} + x + 1\) 
24  \(S[24] = \)11110  \(x^{24} = x^{4} + x^{3} + x^{2} + x + \phantom{1}\llap{0}\) 
25  \(S[25] = \)11001  \(x^{25} = x^{4} + x^{3} + \phantom{x^{2}}\llap{0} + \phantom{x}\llap{0} + 1\) 
26  \(S[26] = \)10111  \(x^{26} = x^{4} + \phantom{x^{3}}\llap{0} + x^{2} + x + 1\) 
27  \(S[27] = \)01011  \(x^{27} = \phantom{x^{4}}\llap{0} + x^{3} + \phantom{x^{2}}\llap{0} + x + 1\) 
28  \(S[28] = \)10110  \(x^{28} = x^{4} + \phantom{x^{3}}\llap{0} + x^{2} + x + \phantom{1}\llap{0}\) 
29  \(S[29] = \)01001  \(x^{29} = \phantom{x^{4}}\llap{0} + x^{3} + \phantom{x^{2}}\llap{0} + \phantom{x}\llap{0} + 1\) 
30  \(S[30] = \)10010  \(x^{30} = x^{4} + \phantom{x^{3}}\llap{0} + \phantom{x^{2}}\llap{0} + x + \phantom{1}\llap{0}\) 
31  \(S[31] = \)00001  \(x^{31} = \phantom{x^{4}}\llap{0} + \phantom{x^{3}}\llap{0} + \phantom{x^{2}}\llap{0} + \phantom{x}\llap{0} + 1\) 
If You’ve Seen One Finite Field, You’ve Seen Them All
The odd thing to me about finite fields is that all finite fields of the same order are isomorphic. So any finite field that has \( 2^N \) elements is isomorphic to \( GF(2^N) \) and isomorphic to an Nbit LFSR; understanding any one of these leads to insights for all the others.
Fields are more complex than groups, and yet finite fields have less variety. I guess this is similar in some ways to poetry: there is a huge variety of poetry that has been written, from simple couplets to great epics, some that rhyme and some that do not, some that have consistent meter and some that do not — but if you look at sonnets or limericks they are all basically the same exact structure.
That’s all we’ll be covering today; it’s really all you need to know, in order to create a working LFSR in most programming languages.
Wrapup
Today we looked at LFSRs, which have \( N \) bits of state that change at each step, and produce one output bit at each step. They can be implemented in two ways:
 Fibonacci LFSRs, which shift in an XOR of a given subset of the state bits from the previous step
 Galois LFSRs, which shift in 0 and then XOR a given subset of the state bits (the “taps”) with the output from the previous step
The Galois LFSR is easier to implement; the Nbit Galois LFSR also has more direct correspondence with a particular quotient ring \( GF(2)[x]/p(x) \) where \( p(x) \) is a polynomial of degree N with coefficients 0 or 1. (Each polynomial term with a coefficient of 1 represents a corresponding XOR tap of the LFSR.) This quotient ring is also a finite field, isomorphic to \( GF(2^N) \). There are particular polynomials, called primitive polynomials, which have \( x \) as a generator with maximum order of \( 2^N  1 \), and each corresponding LFSR has period \( 2^N1 \).
We talked about groups, which are sets that remain closed under a particular binary operation (analogous to addition or to multiplication), and have other properties like an identity element, associativity, and inverses. We talked about fields, which are commutative groups that also have a second commutative binary operation (if the first binary operation is analogous to addition, the second is analogous to multiplication) forming a commutative group with all the elements except the identity element of the first binary operation.
Welcome to abstract algebra. We’re not going to dive very deep into it; the kinds of things I cover in this series are really only here for reference, and if you want you can just ignore the heavy math and stick to studying the algorithms or procedures.
There are a number of unanswered questions that the observant reader may have considered, which we’ll be covering in the next few articles in this series, including:
 How to find primitive polynomials of a given degree
 Given the output bits of an LFSR, how to find the corresponding polynomial
 Given the state bits \( S[k] \) of an LFSR, how to find the elapsed number of steps \( k \)
 How do the different state bits of an LFSR relate to each other
Further Reading
If you decide after reading this series that you want more of the theoretical math, I would highly recommend John Kerl’s Computation in Finite Fields which is very technical, and has a very broad coverage of the subject along with lots of good practical applications, but is lightweight on proofs.
Another good source is Solomon Golomb’s Shift Register Sequences, which focuses on… well, shift registers, including all sorts of nonlinear variants of the LFSR. I haven’t read the 3rd edition; the previous publisher of the 2nd edition, Aegean Park Press, appears to be defunct, so it’s good that someone else picked it up.
For rigorous math with proofs, here’s some online references (of various lengths) on finite fields:

Venkat Anantharam, Lecture notes on finite fields, UC Berkeley, 2005

Avi Kak, Polynomial Arithmetic: Theoretical Underpinnings of Modern Cryptography, Purdue University, 2017

Sophie Huczynska and Max Neunhöffer, Finite Fields, RWTH Aachen University, 2013
There’s also Niven and Zuckerman’s book An Introduction to the Theory of Numbers; I can’t find my copy at the moment, but if you want to dive deeply into discrete mathematics in general, it’s a good place to learn how to swim. Don’t get discouraged if your eyes glaze over.
A Few Side Notes
Skip this section if you are impatient; it has no technical content.
I know my business card says “engineer” and I’ve been doing “engineering” work for the last 20 years, but really I’m an applied mathematician in disguise. (Just wanted you to know.) When I was a child in school, I was sure I was going to pursue a career in mathematics. I loved math… up until maybe my junior year in high school, at which time I got the sinking suspicion that the kind of mathematics which I liked — trigonometry and calculus at the time — was not the same kind of mathematics that a modern mathematician actually pursued. The trig and calc was old hat. Already solved. Sorry. The unsolved stuff was all cryptic and obscure.
So then I wanted to study physics, because I liked mechanics and electromagnetism, and it was much more appealing than those other sciences like biology and chemistry that never made much sense to me. But early in college, maybe my freshman year, I got that sinking suspicion again that the kind of physics I liked was not the same physics that an actual professional physicist pursued. The mechanics and E&M were already solved; cuttingedge theoretical research was some crazy stuff with quantum mechanics or string theory or whatnot, and the applied physics was… oh, I don’t remember, none of it appealed to me, stuff with lasers or liquid helium, who knows; it might have just been the general malaise of the physics majors I met, that made me lose interest.
Then I turned to electrical engineering and it was Good Enough. And here I am!
But every so often I brush up against some area of mathematics that resonates with me, I guess because it is elegant in its own right and has no dependency on technology or on some realworld problem someone is trying to solve. (Particularly the realworld engineering problems that I am trying to solve; I need a break sometimes.) There are a few of these topics, particularly in the area of discrete mathematics.
One of them is cryptography. I clearly remember reading about RSA publickey cryptography around 1985 or 1986, from a copy of Martin Gardner’s Mathematical Games column in the August 1977 issue of Scientific American that was sitting on our bookshelf. I liked it because it was interesting and somewhat illicitsounding (unbreakable secret codes!) and the math was just barely within my grasp. Nowadays I understand the basics of RSA pretty well, but the practical aspects of modern cryptography are much more subtle, and constructing your own cryptography from scratch can be a risky endeavour if you truly want information to stay secure.
The other area involves the LFSR. I think I first heard about these around the year 2000 or so after purchasing and reading a copy of Numerical Recipes in C. I don’t know what it is about an LFSR, there’s just something simple and symmetric and beautiful there, like a spinning bicycle wheel. The concepts of Galois fields aren’t just confined to LFSRs; they also show up in some related areas like CRCs and ReedSolomon encoding.
What’s With the ExPralite Monks?
The curse of the monad is that once you get the epiphany, once you understand, “Oh, that’s what it is,” you lose the ability to explain it to anybody else.
— Douglas Crockford, “Monads & Gonads” (YUIConf 2012)
Being an amateur in any technical field is tough, because you’re decades or even centuries behind experts in the field, especially in mathematics, where so much of it is layers upon layers of abstraction. It’s sort of the knowledge equivalent of software that pulls in dependencies on tons of libraries. I don’t want to have to know how to use Spring and Hibernate just to learn Java. But math is like that. If you really want to start from first principles (arithmetic, with a smattering of basic algebra, trigonometry, geometry, and maybe a little calculus), there are going to be a lot of areas of mathematics that are way out of reach because there’s such a huge learning curve, and because the dependencies are serial: once I understand concepts 1, 2, 3, … N, I might be able to begin understanding concept N+1 and if I’m lucky then N+2, but I can’t go on until I have a solid understanding. You don’t jump ahead by leaps and bounds in understanding mathematics unless you’re a prodigy. And conversely, if you want to teach someone concept N+1, if you’re at an understanding of concept N+4, you have to somehow forget everything you know about N+4 and N+3 and maybe even N+2, and focus on N and N+1. It’s tough. Crockford’s quip on the curse of the monad is really relevant to almost any abstract part of mathematics; Brent Yorgey’s article on the “monad tutorial fallacy” also explains the problem:
The heart of the matter is that people begin with the concrete, and move to the abstract. Humans are very good at pattern recognition, so this is a natural progression. By examining concrete objects in detail, one begins to notice similarities and patterns, until one comes to understand on a more abstract, intuitive level.
…
What I term the “monad tutorial fallacy,” then, consists in failing to recognize the critical role that struggling through fundamental details plays in the building of intuition. This, I suspect, is also one of the things that separates good teachers from poor ones. If you ever find yourself frustrated and astounded that someone else does not grasp a concept as easily and intuitively as you do, even after you clearly explain your intuition to them (“look, it’s really quite simple,” you say…) then you are suffering from the monad tutorial fallacy.
If we want to teach someone abstract mathematics, somehow we need to capture in our brain this essence of what it’s like to have a fresh understanding of a subject, before it is so ingrained in our mind that we can’t remember what it was like without it. It reminds me of a passage in Douglas Adams’s The Restaurant at the End of the Universe:
The Galaxy is littered with exPralite monks, all on the make, because the mental control techniques the Order have evolved as a form of devotional discipline are, frankly, sensational — and extraordinary numbers of monks leave the Order just after they have finished their devotional training and just before they take their final vows to stay locked in small metal boxes for the rest of their lives.
So I’m going to try to be an exPralite monk (but in a good way!), and do what I can to teach you about the mathematics of the LFSR, before I understand it too well and can no longer explain it.
The final note here is that the person I am mainly writing this article for (along with many of my other articles) is myself, so that 5 years from now when I can’t remember a darned thing about the theory behind LFSRs, I can learn from the exPralite monk I used to be. Sometimes my coworkers are the intended audience, so I can just send them away with a link to an article, and then I can get my own work done. If you’re reading this, you’re probably a distant third when it comes to my priorities. But I still hope you’ll find it useful.
Dedicated in memory of Évariste Galois and Douglas Adams, who died before they were ready.
© 2017 Jason M. Sachs, all rights reserved.