# Elliptic Curve Cryptography

Secure online communications require encryption.  One standard is AES (Advanced Encryption Standard) from NIST.  But for this to work, both sides need the same key for encryption and decryption.  This is called Private Key encryption.  Public Key encryption is used to create a private key between two sides that have not previously communicated.  Compared to the history of encryption, Public Key methods are very recent having been started in the 1970's.  Elliptic Curve Public Key cryptography started in the mid 1980's and a great deal of research has shown it is highly secure and efficient.

Elliptic curve public key crypto systems have a lot of advantages compared with other public key methods. One of the major advantages is that any value can be a key.  This means the hash of a pass phrase can be used and for very secure systems it never needs to be stored.  Additionally the security of the system is fully exponential in the field over which it is computed.

Elliptic curve math has very little to do with conic sections.  It is a two dimensional curve with an equation of the form

$$y^2 + a_1 x y + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6$$

This form applies when x and y are real as well as when they are modulo a prime greater than 3.  For use in an FPGA we prefer to operate on polynomials over $GF(2^n)$ which means each coefficient of the polynomial is either 0 or 1.  In $GF(2^n)$ the above form simplifies to $$y^2 + x y = x^3 + a x^2 + b$$ where each value, $x, y, a$ and $b$ are polynomials over $GF(2^n)$.

A point $P = (x, y)$ is "on the curve" when we plug in $x$ and $y$ for some values of $a$ and $b$ and the equality is true. There are two values of $y$ for every $x$ because it is a quadratic equation.  When $x = 0$ the two values are the same (since $+\sqrt{b} = -\sqrt{b}$).

A new set is formed by the points on an elliptic curve over a finite field.  Since the field is finite, the number of points on the curve is finite.  When mathematicians play with sets, they try to see if they can operate on the elements in the set.  The first operation is addition.  So what does it mean to add two points in the set?  If we have points $P$, $Q$ and $R$ can it make sense for the operation $$P + Q = R$$ so that all the points are on the curve? If so, then we have a group!

The rules of addition include an "identity element" which we call $0$ for integers.  For elliptic curves the identity element is called "the point at infinity".  So another operation we must have is $$P + \oslash = P$$ This implies that $P - P = \oslash$. So what is the point $-P$?

With the point $P = (x,y)$ in $GF(2^n)$ we take $-P = (x, x+y)$.  Substituting $x+y$ for $y$ in the elliptic curve equation we see that $$(x+y)^2 + x(x+y) = x^2 + y^2 + x^2 + x y = y^2 + x y$$ so the left hand side does not change.  We see we can at least define $P$ and $-P$, but how do we actually add points?

The equations are straight forward - let $P = (x_1, y_1), Q = (x_2, y_2)$ and $R = (x_3, y_3)$. Define $$\theta=\left(\frac {y_2-y_1}{x_2-x_1}\right)$$ then $$\begin{array}{c}x_3=\theta^2 +\theta+x_1+x_2+a\\y_3=\theta(x_1+x_3)+x_3+y_1\end{array}$$

That's nice and it works as long as $P$ and $Q$ are different points. If they are the same point we get $\theta=0/0$ which is useless (or undefined in the mathematical sense).  To add a point to itself is called "doubling" and the formula for that is given by $$\begin{array}{c}\theta=x_1+\frac{x_1}{y_1} \\x_3=\theta^2 +\theta+a\\ y_3=x_1^2+(\theta+1)x_3 \end{array}$$

Now we can make life very confusing.  We have described two uses of the "$+$" sign.  One is to add two polynomials over $GF(2^n)$, and one is to add two points over an elliptic curve. In an attempt to keep things straight, I use capital letters for points ($P$, $Q$, $R$) and lower case letters for polynomials ($x_i$, $y_i$, $a$, etc). When we write $R=P+Q$ this means to execute the above algorithms for the values $(x_1, y_1), (x_2, y_2), (x_3, y_3)$.

Now that we have defined addition, we can extend it to multiplication.  The value $S=nP$ means the point $S$ is the point $P$ added to itself $n$ times. There are efficient ways to perform multiplication using doubling and adding.  For example $10 P = ((2P)2+P)2$.  If you use lookup tables you can do sums of a point and then do factors of 4, 8 or 16 instead of just doubling.  But the point is, we can now do math with a group made up of points on an elliptic curve.

So how is elliptic curve point addition used to exchange a secret key?  First, we need to pick a curve which has special properties.  Since the number of points on the curve is finite, we look for curves which contain a very large prime factor in the number of points on the curve.  For example, if we have a field over $GF(2^{226})$ with prime polynomial $p=\sum_{i=0}^{i=225}x^i$ with $a=x^{58} + x^{32} + x^5$ and $b=x^{220} + x^{213} + x^{200} + x^{199} + x^{192} + x^{186} + x^{184} + x^{176} + x^{174} + x^{173} + x^{171} \\ + x^{160} + x^{157} + x^{145} + x^{141} + x^{128} + x^{126} + x^{125} + x^{121} + x^{120} + x^{119} + x^{116} + x^{115} \\ + x^{110} + x^{100} + x^{96} + x^{93} + x^{92} + x^{88} + x^{87} + x^{80} + x^{64} + x^{63} + x^{60} + x^{55} + x^{52}\\ + x^{50} + x^{48} + x^{46} + x^{44} + x^{40} + x^{32} + x^{30} + x^{29} + x^{26} + x^{25} + x^{24} + x^{23} +x^{22}\\ + x^{20} + x^{15} + x^{13} + x^{12} + x^{11} + x^{10} + x^6 + x^3$ the number of points on the curve is $2 \ast$ 53919893334301279589334030174039267134757445848588903018047088118159 which is indeed a very large prime!

The next thing we do is pick a common base point on the curve which has the order of that large prime factor.  This is actually very easy: pick any random point on the curve and multiply it by all the factors that are not the prime factor.  So in the above example, some random point $G'$ can be doubled $G = 2 G'$ and we have a base point $G$ with the correct order.

What does "Order of a point" mean? If the order of a point $G$ is $p$ then $\oslash = p G$. Or in words, if we add a point to itself the order of the point times, we get the point at infinity.  This is just like a prime modulus.  Only instead of working with integers modulo a prime number, we are working on an elliptic curve with a prime number of points.

The first public key system was described by Diffie and Hellman.  The elliptic curve version of it is similar.  Alice and Bob pick random values $a$ and $b$ (which can be the hash of a pass phrase) and multiply the base point by these secret numbers.  Alice then has a public key $A = a G$ and Bob has public key $B = b G$.  Like all public key systems, there is a secret key which is never shared along with a public key which can be transmitted openly.

After exchanging keys each side then computes the same secret key: Alice computes $S = a B$ and Bob computes $S = b A$.  They are the same since $a B = a b G = b A$.  Once they both have the same secret key they can proceed to use AES for the remainder of their communications.

To crack this system an attacker must attempt to find $a$ from $A$ or $b$ from $B$.  This is called the Elliptic Curve Logarithm Problem.  The best algorithms are exponential with half the field size.  So in the above example, the size of the prime factor is about $10^{67}$ which means the curve security is about $10^{33}$.  To put this in perspective, suppose you can do one elliptic curve point test per nanosecond.  Then it still takes $10^{24}$ seconds to find one secret key. There are over $10^7$ seconds in a year, so that's around $10^{16}$ years, or 100 times the age of the universe!

Elliptic Curve mathematics is straight forward to implement in both FPGA's and embedded processors.  The choice of underlying field depends on the device and level of security required.   A lot of research has been done (especially by PhD students) finding better ways to compute elliptic curves in FPGA's.  This research is continuing because elliptic curve cryptosystems are the most secure at the lowest resource cost of any other public key system.

Elliptic Curve Cryptography is really about getting the low level math for point addition and the middle level math for combining doubling and adding to implement multiplication.  There are many other algorithms which can be implemented than just Diffie-Hellman including digital signatures.  I'll save those for another blog.

[ - ]
Comment by November 15, 2015
Nice explanation.
[ - ]
Comment by November 16, 2015
The choice of F(2^226) is highly unusual, and the use of a composite extension field opens the door to faster attacks via weil descent. Of course, this page is just an example-- but someone might take parameters from it.

One of the NIST parameters would be a better choice; though curves over binary extension fields are becoming less popular with cryptographers due to recent attacks; regardless of how tidy the hardware is.
[ - ]
Comment by November 16, 2015
The nice thing about 226 is that it is 2*113 and 113 is prime. So a composite extension would be the same order as the base security (~2^113). That slows down Weil descent attacks.

I do agree that the NIST curves are a good choice. Many people prefer working over GF(p) directly for more security. So it definitely depends on how secure you need to be. A large field size takes longer, and if there are ways to speed things up 10 orders of magnitude, that still means it takes 10^6 years to crack GF(2^226). That's good enough for me!
[ - ]
Comment by November 16, 2015
227 and 223 are both prime and would have made nicer choices. You may be overestimating the security of 113-bit ECC, just a bit: http://eprint.iacr.org/2015/143/ (though indeed, there are overheads for working with the extension).
[ - ]
Comment by November 16, 2015
Greg,

I don't know how this web page works - I seem to have clicked on something that did not do what I expected and I can't undo it.

You are correct that 227 is prime - that is why I chose GF(2^226) - this is a type 1 optimal normal basis. It is really fast for squaring and multiplying. It is also 2 times a prime, so the Weil descent attack is difficult. Most people are not building public key systems to protect nuclear weapons though, so security beyond 50 years is typically good enough. One does have to take into account the improvement over time of capability, but when 10 orders of magnitude is your buffer, you don't really have much to worry about.
[ - ]
Comment by November 17, 2015
Here is another interesting paper on Weil Descent: https://math.dartmouth.edu/~carlp/WeilDescent-Product.pdf
It says "This lower bound result shows that asymptotically, the Weil descent attacks are almost never computationally efficient." What you are protecting is just as important to consider as how well you protect it. For most practical applications, being overly worried about academic possible attacks interferes with getting a job done. If you are protecting billions of dollars, or nuclear facilities then worrying about academic possible attacks is probably a good idea.

To post reply to a comment, click on the 'reply' button attached to each comment. To post a new comment (not a reply to a comment) check out the 'Write a Comment' tab at the top of the comments.