Math computing time statistics for ARM7TDMI and MSP430

Started by Tilmann Reh November 17, 2006
steve wrote:
> Tilmann Reh wrote:
>> Hello,
>>
>> for an estimation of required computing time I would like to roughly
>> know the time that current controllers need for math operations
>> (addition/subtraction, multiplication, division, and also logarithm)
>> in single and/or double precision floating point format (assuming
>> common compilers).
>>
>> The MCUs in question are ARM7TDMI of NXP/Atmel flavour (LPC2000 or
>> SAM7), and Texas MSP430.
>
> Cycles
>
Here are some AVR figures: IAR, full optimization for speed, run in the AVR Studio simulator. Obviously, you can't compare exactly without using the same source. The figures were measured in a subroutine after the values had been loaded into registers. I only tested one set of data, so I can't say if it is typical or not.

add 173
sub 176
mul 175
div 694
sqrt 2586
log 3255

It looks like the MSP430 and the AVR are about the same speed at the same clock frequency. The IAR sqrt/log libraries seem a little bit on the slow side.
> MSP430, 32 bit floats, imagecraft compiler, typical cycles
> add 158
> sub 184
> mul 332
> div 620
>
> ARM, keil compiler 32 bit floats, typical cycles
> add 53
> sub 53
> mul 48
> div 224
> sqrt 439
> log 435
>
> ARM, GNU compiler 32 bit floats, typical cycles
> add 472
> sub 478
> mul 439
> div 652
> sqrt 2387
> log 13,523
>
>
> 8051, keil compiler, 32 bit floats, typical cycles
> add 199
> sub 201
> mul 219
> div 895
> sqrt 1117
> log 2006
>
>
> max cycles up to 2x typical
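To turn cycle counts like the ones quoted in this thread into the time estimate the original question asks for, divide by the core clock. The following is only a rough C sketch: the function names are made up for illustration, the clock frequency is whatever your board runs at, and the worst-case helper simply applies the "max cycles up to 2x typical" remark from the quoted figures.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a cycle count into microseconds at a given core clock (in Hz).
   Hypothetical helper name; ignores wait states, interrupts and call overhead. */
double cycles_to_us(uint32_t cycles, double clock_hz)
{
    assert(clock_hz > 0.0);
    return (double)cycles * 1e6 / clock_hz;
}

/* Pessimistic estimate based on the "max cycles up to 2x typical" remark. */
double worst_case_us(uint32_t typical_cycles, double clock_hz)
{
    return 2.0 * cycles_to_us(typical_cycles, clock_hz);
}
```

For example, assuming an 8 MHz clock (an assumption, not a figure from the thread), cycles_to_us(620, 8e6) gives 77.5 microseconds for the quoted MSP430 divide, and worst_case_us(620, 8e6) gives 155.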
--
Best Regards,
Ulf Samuelsson
ulf@a-t-m-e-l.com
This message is intended to be my own personal view and it may or may not be shared by my employer Atmel Nordic AB
Ulf Samuelsson wrote:

> It looks like the MSP430 and the AVR are about the same speed
> at the same clock frequency.
> The IAR sqrt/log libraries seem a little bit on the slow side.

Thanks, that's some more great data. All this should be put on a web page somewhere. I always thought floating point subroutines were a good test of a processor.
In article <1163882513.747840.143520@h54g2000cwb.googlegroups.com>, 
bungalow_steve@yahoo.com says...
>
> Ulf Samuelsson wrote:
>
> > It looks like the MSP430 and the AVR are about the same speed
> > at the same clock frequency.
> > The IAR sqrt/log libraries seem a little bit on the slow side.
>
> Thanks, that's some more great data. All this should be put on a web
> page somewhere. I always thought floating point subroutines were a good
> test of a processor.

Or maybe of library writers. Of course a performance test should probably include a correctness test so log doesn't cheat and always return 1.0

Robert

--
Posted via a free Usenet account from http://www.teranews.com
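A correctness check of the kind suggested here could look roughly like the C sketch below. Everything in it is an assumption for illustration: the names, the handful of reference points, and the absolute tolerance are not from any vendor's test suite. The function under test is passed as a pointer, so the same check can be run against each library's logf before its cycle count is taken seriously.

```c
#include <assert.h>
#include <stddef.h>

typedef float (*logf_fn)(float);

/* A deliberately wrong "fast" log, as in the joke above. */
float cheating_log(float x)
{
    (void)x;
    return 1.0f;
}

/* Counts reference points where fn(x) differs from ln(x) by more than tol.
   The table is a small hand-picked set, not an exhaustive test. */
int count_log_failures(logf_fn fn, float tol)
{
    static const struct { float x; float expected; } ref[] = {
        { 1.0f,        0.0f        },
        { 2.0f,        0.69314718f },
        { 2.7182818f,  1.0f        },
        { 10.0f,       2.3025851f  },
        { 0.5f,       -0.69314718f },
    };
    int failures = 0;
    size_t i;

    assert(fn != NULL && tol > 0.0f);
    for (i = 0; i < sizeof ref / sizeof ref[0]; i++) {
        float err = fn(ref[i].x) - ref[i].expected;
        if (err < 0.0f) {
            err = -err;
        }
        /* Written as !(err <= tol) so that a NaN result also counts as a failure. */
        if (!(err <= tol)) {
            failures++;
        }
    }
    return failures;
}
```

With a tolerance of 1e-5f, the cheating version is caught on four of the five points; it only "passes" at x = e, where the true answer happens to be 1.0. A real library's logf would be passed in the same way.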
On 17 Nov 2006 14:53:58 -0800, "steve" <bungalow_steve@yahoo.com>
wrote:

>MSP430, 32 bit floats, imagecraft compiler, typical cycles
>add 158
>sub 184
>mul 332
>div 620

Back in 2004, I wanted to play with writing a floating point routine for the MSP430. It accepts IEEE format 32-bit floats. The 32-bit by 32-bit with 32-bit result floating point divide takes roughly 400-435 cycles on the MSP430. This is substantially less than the 620 cycles mentioned above.

(I think a lot has to do with how much time various compiler folks decide to invest in their libraries. It can be a serious time sink for a compiler vendor to optimize them for a single processor. In my case, I only invested time in writing one routine, the 32fp divide, so it was fun. I didn't have to produce all of the routines with the various combinations of data types which would probably have turned it into 'real work.')

Jon
Jonathan Kirwan wrote:
> On 17 Nov 2006 14:53:58 -0800, "steve" <bungalow_steve@yahoo.com>
> wrote:
>
> >MSP430, 32 bit floats, imagecraft compiler, typical cycles
> >add 158
> >sub 184
> >mul 332
> >div 620
>
> Back in 2004, I wanted to play with writing a floating point routine
> for the MSP430. It accepts IEEE format 32-bit floats. The 32-bit by
> 32-bit with 32-bit result floating point divide takes roughly 400-435
> cycles on the MSP430. This is substantially less than the 620 cycles
> mentioned above.

Probably because you didn't support all the IEEE 754 exception/rounding modes that compilers support.
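To illustrate what a divide routine can leave out, here is a minimal C sketch of a software 32-bit float divide working only on integers. It is not Jon's routine and not any vendor's library code; the names are hypothetical. The stated assumptions: both operands are normal, non-zero numbers, the result is truncated instead of rounded to nearest, overflow returns infinity, and underflow returns zero rather than a denormal. NaN, infinity, zero and denormal inputs, rounding modes and exception flags are all skipped, which is exactly the kind of work that costs extra cycles in a complete implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Divide two binary32 values given as raw bit patterns (hypothetical helper). */
uint32_t soft_fdiv_bits(uint32_t a, uint32_t b)
{
    uint32_t sign = (a ^ b) & 0x80000000u;
    int32_t  ea   = (int32_t)((a >> 23) & 0xFFu);
    int32_t  eb   = (int32_t)((b >> 23) & 0xFFu);
    uint32_t ma   = (a & 0x007FFFFFu) | 0x00800000u; /* restore hidden 1 bit */
    uint32_t mb   = (b & 0x007FFFFFu) | 0x00800000u;
    uint32_t q    = 0;
    int32_t  e;
    int      i;

    /* Assumption: normal operands only (no zero, denormal, infinity, NaN). */
    assert(ea > 0 && ea < 255 && eb > 0 && eb < 255);

    e = ea - eb + 127;                 /* re-biased result exponent */
    if (ma < mb) {                     /* force quotient into [1, 2) */
        ma <<= 1;
        e -= 1;
    }

    /* Shift-and-subtract division producing 24 quotient bits. */
    for (i = 0; i < 24; i++) {
        q <<= 1;
        if (ma >= mb) {
            ma -= mb;
            q |= 1u;
        }
        ma <<= 1;
    }

    if (e >= 255) return sign | 0x7F800000u; /* overflow: infinity */
    if (e <= 0)   return sign;               /* underflow: flush to zero */
    return sign | ((uint32_t)e << 23) | (q & 0x007FFFFFu);
}

/* Convenience wrapper taking and returning float. */
float soft_fdiv(float x, float y)
{
    uint32_t a, b, r;
    float out;
    memcpy(&a, &x, sizeof a);
    memcpy(&b, &y, sizeof b);
    r = soft_fdiv_bits(a, b);
    memcpy(&out, &r, sizeof out);
    return out;
}
```

Because of the truncation, a result such as 1.0/3.0 comes out one unit in the last place below the correctly rounded value. Adding guard bits, round-to-nearest-even and the special cases is where the remaining effort would go.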
Robert Adsett wrote:
> In article <1163882513.747840.143520@h54g2000cwb.googlegroups.com>,
> bungalow_steve@yahoo.com says...
> >
> > Ulf Samuelsson wrote:
> >
> > > It looks like the MSP430 and the AVR are about the same speed
> > > at the same clock frequency.
> > > The IAR sqrt/log libraries seem a little bit on the slow side.
> >
> > Thanks, that's some more great data. All this should be put on a web
> > page somewhere. I always thought floating point subroutines were a good
> > test of a processor.
>
> Or maybe of library writers.

Well, if library writers have a tough time writing a fast floating point algorithm for a specific processor, I probably will too!
On 18 Nov 2006 14:40:36 -0800, "steve" <bungalow_steve@yahoo.com>
wrote:

>Jonathan Kirwan wrote:
>> On 17 Nov 2006 14:53:58 -0800, "steve" <bungalow_steve@yahoo.com>
>> wrote:
>>
>> >MSP430, 32 bit floats, imagecraft compiler, typical cycles
>> >add 158
>> >sub 184
>> >mul 332
>> >div 620
>>
>> Back in 2004, I wanted to play with writing a floating point routine
>> for the MSP430. It accepts IEEE format 32-bit floats. The 32-bit by
>> 32-bit with 32-bit result floating point divide takes roughly 400-435
>> cycles on the MSP430. This is substantially less than the 620 cycles
>> mentioned above.
>
>Probably because you didn't support all the IEEE 754 exception/rounding
>modes that compilers support.

Steve, do you _know for certain_ that the library tested above from Imagecraft does support all of them? It's been my own experience that the libraries for floating point don't completely support all types and exceptions. Are you sure this is the case here?

In the example I was testing out, I was examining just one compiler library routine to mimic its behavior. I think I captured all the elements there, but it's probable that the compiler itself has advanced in the two intervening years and it wasn't Imagecraft's anyway, so your point may remain a good one to keep in mind. I believe I wouldn't need another 200 cycles, though, to achieve what extra is done in compiler libraries.

I'd be very interested in finishing it up, though, so as to exactly match the features of the Imagecraft routine you tested with, if provided with a complete implementation of their 32-bit fp divide for the MSP430 so that I could personally guarantee that I've met the goal. Not that this would prove anything much, except that more time given to informed effort is better than less time. Still, I'd do it for the fun of trying.

Jon
steve wrote:
> Robert Adsett wrote:
> > In article <1163882513.747840.143520@h54g2000cwb.googlegroups.com>,
> > bungalow_steve@yahoo.com says...
> > >
> > > Ulf Samuelsson wrote:
> > >
> > > > It looks like the MSP430 and the AVR are about the same speed
> > > > at the same clock frequency.
> > > > The IAR sqrt/log libraries seem a little bit on the slow side.
> > >
> > > Thanks, that's some more great data. All this should be put on a web
> > > page somewhere. I always thought floating point subroutines were a good
> > > test of a processor.
> >
> > Or maybe of library writers.
>
> Well, if library writers have a tough time writing a fast floating point
> algorithm for a specific processor, I probably will too!

I was suggesting that library performance may rely as heavily on the writer of the library as it does on the micro. All things being equal it may reveal micro performance. Seldom are all things equal.

Robert
Jonathan Kirwan wrote:

> Steve, do you _know for certain_ that the library tested above from
> Imagecraft does support all of them? It's been my own experience that
> the libraries for floating point don't completely support all types
> and exceptions. Are you sure this is the case here?
>

No, I am not certain. Imagecraft claims IEEE floating point, which means it should be compatible with IEEE 754, so that it behaves identically to IEEE 754 compatible FPUs. Maybe your MSP430 had the hardware multiplier?
On Sat, 18 Nov 2006 16:35:15 -0500, Robert Adsett
<subscriptions@aeolusdevelopment.com> wrote:

>In article <1163882513.747840.143520@h54g2000cwb.googlegroups.com>,
>bungalow_steve@yahoo.com says...
>>
>> Ulf Samuelsson wrote:
>>
>> > It looks like the MSP430 and the AVR are about the same speed
>> > at the same clock frequency.
>> > The IAR sqrt/log libraries seem a little bit on the slow side.
>>
>> Thanks, that's some more great data. All this should be put on a web
>> page somewhere. I always thought floating point subroutines were a good
>> test of a processor.
>
>Or maybe of library writers. Of course a performance test should
>probably include a correctness test so log doesn't cheat and always
>return 1.0

There are myriads of floating point formats in the world, some of which may be easier to implement with integer-only hardware. Some that use the IEEE-754 bit layout for sign, exponent and mantissa might not support denormalised (extremely small) values, might not handle NaNs, etc.

Starting with C99, a compiler that defines __STDC_IEC_559__ has to implement the IEEE/IEC floating point formats to the letter.

Paul
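The bit layout mentioned above (1 sign bit, 8 exponent bits biased by 127, 23 mantissa bits for a 32-bit float) can be inspected with a few lines of C. This is only a sketch with made-up names, and it assumes float is a 32-bit IEEE-754 binary32 value on the target. The last function reports whether the compiler defines __STDC_IEC_559__; a library that does not can be probed by feeding it denormals or NaNs built with f32_from_bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef enum { F32_ZERO, F32_SUBNORMAL, F32_NORMAL, F32_INF, F32_NAN } f32_kind;

typedef struct {
    unsigned sign;     /* 1 bit */
    unsigned exponent; /* 8 bits, biased by 127 */
    uint32_t mantissa; /* 23 bits, hidden leading 1 not stored */
} f32_fields;

/* Build a float from a raw bit pattern (assumes 32-bit float). */
float f32_from_bits(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Split a float into its sign, exponent and mantissa fields. */
f32_fields f32_split(float f)
{
    uint32_t bits;
    f32_fields out;
    memcpy(&bits, &f, sizeof bits);
    out.sign     = (unsigned)(bits >> 31);
    out.exponent = (unsigned)((bits >> 23) & 0xFFu);
    out.mantissa = bits & 0x007FFFFFu;
    return out;
}

/* Classify by exponent and mantissa, without relying on <math.h>. */
f32_kind f32_classify(float f)
{
    f32_fields p = f32_split(f);
    if (p.exponent == 0)   return p.mantissa ? F32_SUBNORMAL : F32_ZERO;
    if (p.exponent == 255) return p.mantissa ? F32_NAN : F32_INF;
    return F32_NORMAL;
}

/* 1 if the compiler claims IEC 60559 (IEEE-754) conformance, else 0. */
int claims_iec559(void)
{
#ifdef __STDC_IEC_559__
    return 1;
#else
    return 0;
#endif
}
```

For example, f32_split(1.0f) yields sign 0, exponent 127 and mantissa 0, and f32_classify(f32_from_bits(0x00000001u)) reports the smallest denormal as F32_SUBNORMAL.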