EmbeddedRelated.com
Forums

RSA on 8bit micro

Started by W Trembyle January 24, 2007
In article <Xns98C26B4581296asldkvnfoweirfjmfjje@127.0.0.1>,
 W Trembyle wrote:

> I'm working with a Cypress PSoC micro with an M8C core (CY8C29666). > I have 32K ROM and 1K RAM. I need to implement 1024 bit RSA encryption > in as small a footprint as possible.
Is the application encryption, or signature verification (like in insuring integrity of code or data)? In the former case, you may need protection against a variety of attacks (timing, DPA, SPA..), when in the later you do not. If your application is DEcryption, or signature GENERATION, you are out of luck with an 8-bit micro, unless you have lots of time; and you likely need protections as suggested above. My company sells optimized assembly code that performs modexp as needed for RSA/Rabin signature verification, including on very low-end micros; and more generally security-related code and counselling. We mostly work on 8051, ST7, 68HC11, PIC18, 680x0, and ARM, but are open to other micros. From the spec of our modexp library: Computes a^e mod n, with 0<e<2^32, and modulus n odd of runtime-specified size, up to and including 2048 bit. 8051: execution time 0.47 s for e=3, 1024 bit, @40Mhz Xtall (3,33 Mcycles/s) 820 bytes of code 768 bytes of contiguous, aligned XDATA RAM includes 256 bytes result but not inputs 6 other contiguous XDATA bytes 2 DATA bytes (only; was a customer requirement) 1 BDATA byte (bit-adressable) 7 bytes of stack, including return address 68HC11: execution time 0.49 s for e=3, 1024 bit, @20MHz Xtall (5 Mhz EClk) 1260 bytes of code 782 bytes of RAM includes 256 bytes result but not inputs 9 bytes of zero-page RAM 9 bytes of stack, including return address Runtime is nearly halved using e=2 (Rabin signature, functionaly equivalent to RSA signature). Francois Grieu
"Vladimir Vassilevsky" <antispam_bogus@hotmail.com> wrote in message 
news:SCPth.75385$wP1.54698@newssvr14.news.prodigy.net...
> > > W Trembyle wrote: > >> >> I still have hope though and will keep trying to find something to make >> the development faster. >> > > RSA is dead simple. > All you have to do is compute pow(A,B)%C where A, B, C are the very long > integer variables. You may have to develop the multiprecision integer > arithmetics, but this should not be a problem either. > > Vladimir Vassilevsky > > DSP and Mixed Signal Design Consultant > > http://www.abvolt.com
For the multi-precision arithmetics, the GMP source code gives (the treatment of natural numbers) some good examples. The algorithms you'll want to use are right out TAOCP, Volume II. -- David T. Ashley (dta@e3ft.com) http://www.e3ft.com (Consulting Home Page) http://www.dtashley.com (Personal Home Page) http://gpl.e3ft.com (GPL Publications and Projects)
In article <Xns98C26B4581296asldkvnfoweirfjmfjje@127.0.0.1>, W Trembyle 
<noreply@nowhere.com> writes
>I'm working with a Cypress PSoC micro with an M8C core (CY8C29666). I have >32K ROM and 1K RAM. I need to implement 1024 bit RSA encryption in as >small a footprint as possible. I do not need to do any key generation in >the micro itself. > >I know that RSA is common on Smart Card micros and secure micros with >similar resorces, but I can't find any good examples.
You won't Smart card things are VERY tightly controlled. Usually in 8 bit smart cards the RSA is handled by a hardware crypto block. You will have to take my word for it as it is very unlikely Philips/NXP, Infineon or Atmel will tell you anything.
>All I find is tons >of Mathamatical and Therotical discussions on RSA, or implementations for a >full PC. >Anyone have any pointers to some source code that I can port over to this >micro. Preferebly in C or Assembly.
Yes.... Http://quest.phaedsys.org Look under the cyphers button at the top. This has two sections. 1 Smart Card cryptography (there is a DES implementation for the 8051 here) 2 The sources for the Applied Cryptography book. This has all the software from the book . Some of it runs on a micro. There is also a section on 8051 and some software for the 51 However be warned I have not updated these pages in some years. -- \/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\ \/\/\/\/\ Chris Hills Staffs England /\/\/\/\/ /\/\/ chris@phaedsys.org www.phaedsys.org \/\/\ \/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
"W Trembyle" <noreply@nowhere.com> wrote in message
news:Xns98C26B4581296asldkvnfoweirfjmfjje@127.0.0.1...
> I'm working with a Cypress PSoC micro with an M8C core (CY8C29666). I
have
> 32K ROM and 1K RAM. I need to implement 1024 bit RSA encryption in as > small a footprint as possible. I do not need to do any key generation in > the micro itself. > > I know that RSA is common on Smart Card micros and secure micros with > similar resorces, but I can't find any good examples. All I find is tons > of Mathamatical and Therotical discussions on RSA, or implementations for
a
> full PC. >
Smartcards use a dedicated 'long integer' coprocessor, and even then it is not blindingly fast or secure. Long time ago there was an article in Dr Dobbs about RSA on a Z80. I remember something like 500 ms for a 512 bit modulus. AFAIR Koc (with c c&#4294967295;dile) has written a lot about the speed of various implementations. Google on: "Chinese Remainder Theorem", Montgomery, Karatsuba. A warning note: naive implementations of RSA can easily reveal the keys to an attacker if the attacker can observe timing or power consumption of the device or if the attacker can manipulate power and or clock signals. See: "Side channel analysis" and "Fault analysis" And try to get access to a good mathematician. When I started working in computer security, I took some courses in algebra and discrete mathematics from the Open University, so I was able to understand the literature, ask the right questions to mathematicians and most important: to understand their answers ;-) Wim
"tum_" <atoumantsev_spam@mail.ru> wrote in message
news:1169674795.297560.219260@k78g2000cwa.googlegroups.com...
> > Your correct. I have 2K RAM and not just 1K. I think I can devote 90%
of
> > this or so to RSA. I can store the key in Flash so no problems there. > > Assembly is the only way to go else you kill yourself in code size. > > Depends on the quality of your compiler, actually. > I'm not familiar with M8C at all, a quick look at the datasheet tells > me that the CPU is quite unusual (I only have Z80 & ARM experience), no > idea if there exist a good C compiler for it. > > > You are also correct in saying that most SmartCards use a cryptographic > > coprocessor. But I know some do not. I have read about implementations > > on 8051 processors and other 8bit micros. I was hopping that someone > > might know an implementation like this. > > > > Speed is no issue, as this is used only for Key exchange of symetric
keys
> > (very infrequently) and firmware update signatures (even more > > infrequently - hopfully :) > > If the speed is not an issue, then the task is definitely doable. "RSA > is dead simple' as Vladimir says, it's just 'x^e mod n' operation and > the only trouble with it is that 'x' & 'n' are big numbers, so if you > have any performance concerns it becomes tricky. > > > > I have done my time in digging through openSSL but have not looked yet
at
> > libtomcrypt. I will check that as well. My problem with these
libraries
> > is that they are designed for PC's with unlimited RAM. Some of there > > structures are larger than my entire RAM. I need some very optomized > > code to make this work. > > I still have hope though and will keep trying to find something to make > > the development faster. > > Well, it's only a few hours since you've posted your request, so you > can still hope . > My feeling is that you will hardly find a ready solution for your > particular CPU. > They probably exist but they are not open. > > In openssl, libtomcrypt and similar libs you'll find the solutions > "optimised for speed" and I perfectly agree that they are hardly > applicable to an 8-bitter with limited memory. > You need to search for a simplistic, straightforward implementation of > RSA, I guess. >
The most understandable implementation I came across was the ancient RSAref. It is written in very portable C
> Do you have assembly experience on the given CPU? >
For performance you have to write only 2 - 4 functions in assembler, mainly : multiply long integer with word.
On Jan 28, 11:34 am, "Wim Ton" <w...@bluewin.ch> wrote:
> Long time ago there was an article in Dr Dobbs about RSA on a Z80. I > remember something like 500 ms for a 512 bit modulus. AFAIR Koc (with c
I've heard about this article but could never find it. Anyway, 500ms=20 for 512 bit doesn't say much unless we know the exponent that was used (and the=20 CPU frequency).
> c=E9dile) has written a lot about the speed of various implementations. > Google on: "Chinese Remainder Theorem", Montgomery, Karatsuba.
Very useful reading indeed. But the OP claims that he's not interested in the performance, he's interested in footprint.
> A warning note: naive implementations of RSA can easily reveal the keys to > an attacker if the attacker can observe timing or power consumption of the > device or if the attacker can manipulate power and or clock signals. See: > "Side channel analysis" and "Fault analysis"
If it's a public key operation, these attacks are irrelevant. It's a pity the OP disappeared from the thread. He's probably started=20 his RSA implementation and it keeps him busy ;-)
> Wim
In article <1170068591.464582.162130@p10g2000cwp.googlegroups.com>,
 "tum_" <atoumantsev_spam@mail.ru> wrote:

>> A warning note: naive implementations of RSA can easily reveal the keys to >> an attacker if the attacker can observe timing or power consumption of the >> device or if the attacker can manipulate power and or clock signals. See: >> "Side channel analysis" and "Fault analysis" > > If it's a public key operation, these attacks are irrelevant.
Fully irrelevant for signature verification; only largely irrelevant for encryption (which also is a public key operation). Francois Grieu
"tum_" <atoumantsev_spam@mail.ru> posted:

"On Jan 28, 11:34 am, "Wim Ton" <w...@bluewin.ch> wrote:
> Long time ago there was an article in Dr Dobbs about RSA on a Z80. I > remember something like 500 ms for a 512 bit modulus. AFAIR Koc (with c
I've heard about this article but could never find it. Anyway, 500ms for 512 bit doesn't say much unless we know the exponent that was used (and the CPU frequency). [..]" Perhaps it was the article documenting 9.5 seconds for 512-bit RSA on a Z80180, Burton S. Kaliski, Jr., "The Z80180 and Big-number Arithmetic: Squeezing 512-bit operations out of 8-bit microcontrollers", WWW.DrDobbs.com/dept/architect/184409071?pgno=19 , date unknown ( WWW.DrDobbs.com recently has had an incorrect date supposedly in 2001 for an article from the 1990s so I do not trust the old dates on WWW.DrDobbs.com ), which I have not read but easily found by entering Z80 RSA into the search bar on WWW.DrDobbs.com .
On Jan 30, 11:15 am, Colin Paul Gloster <Colin_Paul_Glos...@ACM.org> 
wrote:
> "tum_" <atoumantsev_s...@mail.ru> posted: > > "On Jan 28, 11:34 am, "Wim Ton" <w...@bluewin.ch> wrote: > > > Long time ago there was an article in Dr Dobbs about RSA on a Z80. I > > remember something like 500 ms for a 512 bit modulus. AFAIR Koc (with c > > I've heard about this article but could never find it. Anyway, 500ms > for 512 bit > doesn't say much unless we know the exponent that was used (and the > CPU > frequency). > > [..]" > > Perhaps it was the article documenting 9.5 seconds for 512-bit RSA on > a Z80180, Burton S. Kaliski, Jr., "The Z80180 and Big-number > Arithmetic: Squeezing 512-bit operations out of 8-bit > microcontrollers", > WWW.DrDobbs.com/dept/architect/184409071?pgno=19 > , date unknown ( WWW.DrDobbs.com recently has had an incorrect date
Thanks a lot, Colin! I've now got an interesting reading for the evening. (Albeit, not having much practical value for me any more).
> supposedly in 2001 for an article from the 1990s so I do not trust the > old dates on WWW.DrDobbs.com ), which I have not read but easily found > by entering Z80 RSA into the search bar on WWW.DrDobbs.com .
Looks like it's dated 1993 within the article itself. I remember searching for it about 3 years ago and, iirc, there was no _free_ access to this particular article on the Dobbs site and I didn't bother to subscribe and pay at that time.
On Jan 30, 11:15 am, Colin Paul Gloster <Colin_Paul_Glos...@ACM.org> 
wrote:
> "tum_" <atoumantsev_s...@mail.ru> posted: > > "On Jan 28, 11:34 am, "Wim Ton" <w...@bluewin.ch> wrote: > > > Long time ago there was an article in Dr Dobbs about RSA on a Z80. I > > remember something like 500 ms for a 512 bit modulus. AFAIR Koc (with c > > [..] > > Perhaps it was the article documenting 9.5 seconds for 512-bit RSA on > a Z80180, Burton S. Kaliski, Jr., "The Z80180 and Big-number > Arithmetic: Squeezing 512-bit operations out of 8-bit > microcontrollers",
heh, the Z80180 appears to have the MLT instruction performing 8x8 -> 16 bit multiplication in 17 TStates. If I could only have it on plain Z80 core when I worked on this.... The same operation takes about 230-240 tstates on plain Z80, unfortunately. There's only partial code in the article, not the full implementation and I'm a bit puzzled with the sentence "Division takes only a little longer than multiplication..." - I'm a bit in doubt about that but maybe I'm wrong. IIRC, in my experience long division took at least twice as long as multiplication.