Math computing time statistics for ARM7TDMI and MSP430

Reply by Ulf Samuelsson ● November 18, 2006

steve wrote:
> Tilmann Reh wrote:
>> Hello,
>>
>> for an estimation of required computing time I would like to roughly
>> know the time that current controllers need for math operations
>> (addition/subtraction, multiplication, division, and also logarithm)
>> in single and/or double precision floating point format (assuming
>> common compilers).
>>
>> The MCUs in question are ARM7TDMI of NXP/Atmel flavour (LPC2000 or
>> SAM7), and Texas MSP430.
>
> Cycles
>
Here are some AVR figures: IAR, full optimization for speed, run in the
AVR Studio simulator.
Obviously, you can't compare exactly without using the same source.
Figures were measured in a subroutine after the values had been loaded
into registers.
Only one set of data was tested, so I can't say whether it is typical or not.

add     173
sub     176
mul     175
div     694
sqrt   2586
log    3255


It looks like the MSP430 and the AVR are about the same speed
at the same clock frequency.
The IAR sqrt/log libraries seem a little on the slow side.
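
To turn the cycle counts in this thread into the computing time the original question asked about, the arithmetic is simply cycles divided by clock frequency. A minimal sketch in C follows; the function name and the 8 MHz clock used in the note after it are assumptions for illustration, not figures from any post.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a cycle count into execution time in nanoseconds.
 * clock_hz is the core clock in Hz (a hypothetical parameter; use the
 * clock your part actually runs at). The result is rounded to nearest. */
uint64_t cycles_to_ns(uint32_t cycles, uint32_t clock_hz)
{
    assert(clock_hz != 0);
    return ((uint64_t)cycles * 1000000000ull + clock_hz / 2u) / clock_hz;
}
```

At an assumed 8 MHz clock, the 694-cycle divide listed above works out to cycles_to_ns(694, 8000000) = 86750 ns, or roughly 87 microseconds per operation.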

> MSP430, 32-bit floats, ImageCraft compiler, typical cycles
> add 158
> sub 184
> mul 332
> div  620
>
> ARM, Keil compiler, 32-bit floats, typical cycles
> add 53
> sub 53
> mul 48
> div  224
> sqrt 439
> log 435
>
> ARM, GNU compiler, 32-bit floats, typical cycles
> add 472
> sub 478
> mul 439
> div  652
> sqrt 2387
> log 13,523
>
>
> 8051, Keil compiler, 32-bit floats, typical cycles
> add 199
> sub 201
> mul 219
> div  895
> sqrt 1117
> log 2006
>
>
> max cycles up to 2x typical

-- 
Best Regards,
Ulf Samuelsson
ulf@a-t-m-e-l.com
This message is intended to be my own personal view and it
may or may not be shared by my employer Atmel Nordic AB

Reply by steve ● November 18, 2006

Ulf Samuelsson wrote:

> It looks like the MSP430 and the AVR are about the same speed
> at the same clock frequency.
> The IAR sqrt/log libraries seem a little on the slow side.

Thanks, that's some more great data. All this should be put on a web
page somewhere. I always thought floating point subroutines were a good
test of a processor.

Reply by Robert Adsett ● November 18, 2006

In article <1163882513.747840.143520@h54g2000cwb.googlegroups.com>, 
bungalow_steve@yahoo.com says...
> 
> Ulf Samuelsson wrote:
> 
> > It looks like the MSP430 and the AVR are about the same speed
> > at the same clock frequency.
> > The IAR sqrt/log libraries seem a little on the slow side.
> 
> Thanks, that's some more great data. All this should be put on a web
> page somewhere. I always thought floating point subroutines were a good
> test of a processor.

Or maybe of library writers.  Of course a performance test should 
probably include a correctness test so log doesn't cheat and always 
return 1.0
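
A correctness gate of the kind suggested here could look roughly like the sketch below: a candidate log routine is only worth timing if it first matches a few reference values. Everything named in it (log_passes, cheating_log, toy_log, the tolerance) is hypothetical and made up for illustration; toy_log is a deliberately simple stand-in for a library routine, not any vendor's implementation, and it only accepts positive finite inputs.

```c
#include <assert.h>
#include <stddef.h>

typedef float (*log_fn)(float);

/* Reference points: natural logarithms rounded to float precision. */
static const struct { float x, expected; } log_refs[] = {
    {   1.0f,  0.0f       },
    {   2.0f,  0.6931472f },
    {  10.0f,  2.3025851f },
    {   0.5f, -0.6931472f },
    { 100.0f,  4.6051702f },
};

/* Returns 1 if fn matches every reference point within tol, else 0. */
int log_passes(log_fn fn, float tol)
{
    assert(fn != NULL);
    for (size_t i = 0; i < sizeof log_refs / sizeof log_refs[0]; i++) {
        float err = fn(log_refs[i].x) - log_refs[i].expected;
        if (err < 0.0f) err = -err;
        if (!(err <= tol)) return 0;   /* written this way to reject NaN too */
    }
    return 1;
}

/* The "cheat": constant result, unbeatable cycle count. */
float cheating_log(float x) { (void)x; return 1.0f; }

/* Toy natural log, positive finite inputs only: reduce x into
 * [0.75, 1.5) by powers of two, then evaluate
 * ln(m) = 2*atanh((m-1)/(m+1)) as a short odd power series. */
float toy_log(float x)
{
    assert(x > 0.0f);
    int k = 0;
    while (x >= 1.5f) { x *= 0.5f; k++; }
    while (x < 0.75f) { x *= 2.0f; k--; }
    float z  = (x - 1.0f) / (x + 1.0f);
    float z2 = z * z;
    float s  = z * (1.0f + z2 * (1.0f / 3 + z2 * (1.0f / 5
                 + z2 * (1.0f / 7 + z2 * (1.0f / 9)))));
    return 2.0f * s + (float)k * 0.6931472f;
}
```

With a tolerance of 1e-5, log_passes(toy_log, 1e-5f) accepts the honest routine while log_passes(cheating_log, 1e-5f) rejects the constant one, so only the former would go on to be timed.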

Robert


Reply by Jonathan Kirwan ● November 18, 2006

On 17 Nov 2006 14:53:58 -0800, "steve" <bungalow_steve@yahoo.com>
wrote:

>MSP430, 32-bit floats, ImageCraft compiler, typical cycles
>add 158
>sub 184
>mul 332
>div  620

Back in 2004, I wanted to play with writing a floating point routine
for the MSP430.  It accepts IEEE format 32-bit floats.  The 32-bit by
32-bit with 32-bit result floating point divide takes roughly 400-435
cycles on the MSP430.  This is substantially less than the 620 cycles
mentioned above.
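
For readers wondering where several hundred cycles go in such a divide, here is a generic shift-subtract sketch in C. It is not the MSP430 routine described in this post (which is not shown in the thread) and not any compiler vendor's library code. It assumes IEEE 754 single precision, handles normal numbers only, and rounds to nearest even; zero, subnormals, infinities, NaN, overflow and underflow are deliberately left out, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern and back. */
uint32_t float_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
float bits_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Software single-precision divide for normal operands and results. */
float soft_fdiv(float a, float b)
{
    uint32_t ua = float_bits(a), ub = float_bits(b);
    uint32_t sign = (ua ^ ub) & 0x80000000u;
    int32_t ea = (int32_t)((ua >> 23) & 0xFF);
    int32_t eb = (int32_t)((ub >> 23) & 0xFF);
    assert(ea != 0 && ea != 255 && eb != 0 && eb != 255);

    /* 24-bit significands with the implicit leading 1 restored. */
    uint32_t ma = (ua & 0x007FFFFFu) | 0x00800000u;
    uint32_t mb = (ub & 0x007FFFFFu) | 0x00800000u;
    int32_t exp = ea - eb + 127;

    /* Normalise so that mb <= ma < 2*mb, i.e. the quotient is in [1, 2). */
    if (ma < mb) { ma <<= 1; exp--; }
    assert(exp >= 1 && exp <= 254);

    /* Shift-subtract division: 24 result bits plus one guard bit. */
    uint32_t rem = ma, q = 0;
    for (int i = 0; i < 25; i++) {
        q <<= 1;
        if (rem >= mb) { rem -= mb; q |= 1u; }
        rem <<= 1;
    }

    /* Round to nearest even using the guard bit and a sticky flag. */
    uint32_t mant = q >> 1;
    uint32_t guard = q & 1u;
    uint32_t sticky = (rem != 0);
    if (guard && (sticky || (mant & 1u))) {
        mant++;
        if (mant == 0x01000000u) { mant >>= 1; exp++; }
    }
    return bits_float(sign | ((uint32_t)exp << 23) | (mant & 0x007FFFFFu));
}
```

The loop runs 25 times, and each pass does a 32-bit compare, subtract and two shifts, all of which take several instructions on a 16-bit core. That alone puts a straightforward implementation into the hundreds of cycles, before any special-case handling is added.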

(I think a lot has to do with how much time various compiler folks
decide to invest in their libraries.  It can be a serious time sink
for a compiler vendor to optimize them for a single processor.  In my
case, I only invested time in writing one routine, the 32fp divide, so
it was fun.  I didn't have to produce all of the routines with the
various combinations of data types which would probably have turned it
into 'real work.')

Jon

Reply by steve ● November 18, 2006

Jonathan Kirwan wrote:
> On 17 Nov 2006 14:53:58 -0800, "steve" <bungalow_steve@yahoo.com>
> wrote:
>
> >MSP430, 32-bit floats, ImageCraft compiler, typical cycles
> >add 158
> >sub 184
> >mul 332
> >div  620
>
> Back in 2004, I wanted to play with writing a floating point routine
> for the MSP430.  It accepts IEEE format 32-bit floats.  The 32-bit by
> 32-bit with 32-bit result floating point divide takes roughly 400-435
> cycles on the MSP430.  This is substantially less than the 620 cycles
> mentioned above.

Probably because you didn't support all the IEEE 754 exception/rounding
modes that compilers support.

Reply by steve ● November 18, 2006

Robert Adsett wrote:
> In article <1163882513.747840.143520@h54g2000cwb.googlegroups.com>,
> bungalow_steve@yahoo.com says...
> >
> > Ulf Samuelsson wrote:
> >
> > > It looks like the MSP430 and the AVR are about the same speed
> > > at the same clock frequency.
> > > The IAR sqrt/log libraries seem a little on the slow side.
> >
> > Thanks, that's some more great data. All this should be put on a web
> > page somewhere. I always thought floating point subroutines were a good
> > test of a processor.
>
> Or maybe of library writers.

Well, if library writers have a tough time writing a fast floating point
algorithm for a specific processor, I probably will too!

Reply by Jonathan Kirwan ● November 18, 2006

On 18 Nov 2006 14:40:36 -0800, "steve" <bungalow_steve@yahoo.com>
wrote:

>Jonathan Kirwan wrote:
>> On 17 Nov 2006 14:53:58 -0800, "steve" <bungalow_steve@yahoo.com>
>> wrote:
>>
>> >MSP430, 32-bit floats, ImageCraft compiler, typical cycles
>> >add 158
>> >sub 184
>> >mul 332
>> >div  620
>>
>> Back in 2004, I wanted to play with writing a floating point routine
>> for the MSP430.  It accepts IEEE format 32-bit floats.  The 32-bit by
>> 32-bit with 32-bit result floating point divide takes roughly 400-435
>> cycles on the MSP430.  This is substantially less than the 620 cycles
>> mentioned above.
>
>Probably because you didn't support all the IEEE 754 exception/rounding
>modes that compilers support.

Steve, do you _know for certain_ that the library tested above from
Imagecraft does support all of them?  It's been my own experience that
the libraries for floating point don't completely support all types
and exceptions.  Are you sure this is the case here?

In the example I was testing out, I was examining just one compiler
library routine to mimic its behavior.  I think I captured all the
elements there, but it's probable that the compiler itself has
advanced in the two intervening years and it wasn't Imagecraft's
anyway, so your point may remain a good one to keep in mind.

I believe I wouldn't need another 200 cycles, though, to achieve what
extra is done in compiler libraries.  I'd be very interested in
finishing it up, though, so as to exactly match the features of the
Imagecraft routine you tested with, if provided with a complete
implementation of their 32-bit fp divide for the MSP430 so that I
could personally guarantee that I've met the goal.  Not that this
would prove anything much, except that more time given to informed
effort is better than less time.  Still, I'd do it for the fun of
trying.

Jon

Reply by Robert Adsett ● November 18, 2006

steve wrote:
> Robert Adsett wrote:
> > In article <1163882513.747840.143520@h54g2000cwb.googlegroups.com>,
> > bungalow_steve@yahoo.com says...
> > >
> > > Ulf Samuelsson wrote:
> > >
> > > > It looks like the MSP430 and the AVR are about the same speed
> > > > at the same clock frequency.
> > > > The IAR sqrt/log libraries seem a little on the slow side.
> > >
> > > Thanks, that's some more great data. All this should be put on a web
> > > page somewhere. I always thought floating point subroutines were a good
> > > test of a processor.
> >
> > Or maybe of library writers.
>
> Well, if library writers have a tough time writing a fast floating point
> algorithm for a specific processor, I probably will too!

I was suggesting that library performance may rely as heavily on the
writer of the library as it does on the micro.  All things being equal
it may reveal micro performance.  Seldom are all things equal.

Robert

Reply by steve ● November 18, 2006

Jonathan Kirwan wrote:

> Steve, do you _know for certain_ that the library tested above from
> Imagecraft does support all of them?  It's been my own experience that
> the libraries for floating point don't completely support all types
> and exceptions.  Are you sure this is the case here?
>
No, I am not certain. ImageCraft claims IEEE floating point, which means
it should be compatible with IEEE 754, so that it runs identically to
IEEE 754 compatible FPUs.

Maybe your MSP430 had the HW multiply?

Reply by Paul Keinanen ● November 19, 2006

On Sat, 18 Nov 2006 16:35:15 -0500, Robert Adsett
<subscriptions@aeolusdevelopment.com> wrote:

>In article <1163882513.747840.143520@h54g2000cwb.googlegroups.com>, 
>bungalow_steve@yahoo.com says...
>> 
>> Ulf Samuelsson wrote:
>> 
>> > It looks like the MSP430 and the AVR are about the same speed
>> > at the same clock frequency.
>> > The IAR sqrt/log libraries seem a little on the slow side.
>> 
>> Thanks, that's some more great data. All this should be put on a web
>> page somewhere. I always thought floating point subroutines were a good
>> test of a processor.
>
>Or maybe of library writers.  Of course a performance test should 
>probably include a correctness test so log doesn't cheat and always 
>return 1.0

There are myriads of floating point formats in the world, some of
which may be easier to implement with integer-only hardware. Some that
use the IEEE-754 bit layout for sign, exponent and mantissa might
not support denormalised (extremely small) values, might not handle
NaNs, etc.

Starting with C99 you have to implement the IEEE/IEC floating point
formats to the letter if the compiler defines __STDC_IEC_559__.
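
As a rough illustration, a program can test the macro at compile time and probe the two behaviours mentioned above (denormalised values and NaNs) at run time. This is only a sketch under the assumption that float is a 32-bit binary format; the function names are invented for the example, and the probes cover just these two features, not full IEC 60559 conformance.

```c
#include <assert.h>
#include <float.h>

/* 1 if the implementation claims IEC 60559 (C99 Annex F) conformance. */
int claims_iec559(void)
{
#ifdef __STDC_IEC_559__
    return 1;
#else
    return 0;
#endif
}

/* Run-time probe: does halving the smallest normal float give a
 * nonzero (denormalised) result, or is it flushed to zero?
 * volatile keeps the compiler from folding the division away. */
int keeps_subnormals(void)
{
    volatile float tiny = FLT_MIN;
    tiny = tiny / 2.0f;
    return tiny != 0.0f;
}

/* Run-time probe: 0/0 should give a NaN, which compares unequal to itself. */
int has_nan(void)
{
    volatile float zero = 0.0f;
    volatile float q = zero / zero;
    return q != q;
}

/* If the compiler makes the claim, both probes are expected to pass. */
void check_fp_claims(void)
{
    if (claims_iec559())
        assert(keeps_subnormals() && has_nan());
}
```

A library that only borrows the IEEE-754 bit layout would typically fail one of the two probes, in which case the macro should not be defined for it.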

Paul