floating point calculations.| page 3

Reply by Paul Keinanen ● February 11, 2009

On Wed, 11 Feb 2009 10:51:42 +0100, David Brown
<david@westcontrol.removethisbit.com> wrote:

>Paul Keinanen wrote:
>> On Wed, 11 Feb 2009 08:33:54 +0100, David Brown
>> <david@westcontrol.removethisbit.com> wrote:
>> 

>> For really small systems a 3 byte format with 8 bit exponent and 16
>> bit mantissa is often enough and easy to implement. Such a format gives
>> about 4-5 significant digits, which is often enough, when the
>> application is interfacing with the external world with 12-16 bit A/D
>> and D/A converters.
>> 
>
>Yes, such extra formats can be very efficient.  However, you'll probably 
>lose all benefits of having clear and understandable source code (which 
>is often the reason to choose floating point in the first place), unless 
>your compiler supports such formats directly.

Use a language with operator overloading, such as C++.

Paul

Reply by David Brown ● February 11, 2009

Paul Keinanen wrote:
> On Wed, 11 Feb 2009 10:51:42 +0100, David Brown
> <david@westcontrol.removethisbit.com> wrote:
> 
>> Paul Keinanen wrote:
>>> On Wed, 11 Feb 2009 08:33:54 +0100, David Brown
>>> <david@westcontrol.removethisbit.com> wrote:
>>>
> 
>>> For really small systems a 3 byte format with 8 bit exponent and 16
>>> bit mantissa is often enough and easy to implement. Such a format gives
>>> about 4-5 significant digits, which is often enough, when the
>>> application is interfacing with the external world with 12-16 bit A/D
>>> and D/A converters.
>>>
>> Yes, such extra formats can be very efficient.  However, you'll probably 
>> lose all benefits of having clear and understandable source code (which 
>> is often the reason to choose floating point in the first place), unless 
>> your compiler supports such formats directly.
> 
> Use a language with operator overloading, such as C++.
> 

That can help significantly with the syntax, but it can still be hard to 
get an optimal implementation.  Depending on your class (or template) 
structure and your compiler, you might end up with significant overhead 
to using the class.  Even if you can arrange for an optimal balance 
between inlining and function calls, you will still not get as efficient 
an implementation as the compiler's native types, because you don't (for 
most compilers) have access to the cpu's flags, and the compiler will be 
unable to do extra optimisation such as pre-calculating values, strength 
reduction (such as changing a division by a constant into a 
multiplication), and efficient register allocation.
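
To illustrate the strength reduction mentioned above, here is a minimal C sketch of what a compiler typically does for a native unsigned type when dividing by the constant 10. The function name `div10` is hypothetical; the constant is ceil(2^35 / 10). A hand-written class for a custom number format would have to spell out tricks like this explicitly.

```c
#include <stdint.h>

/* Divide an unsigned 32-bit value by 10 without a divide instruction.
 * 0xCCCCCCCD is ceil(2^35 / 10); the 64-bit product shifted right by 35
 * equals x / 10 for every 32-bit x. */
uint32_t div10(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}
```

On a CPU with a 32x32=>64 multiply this replaces a slow division with one multiply and one shift.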

Reply by Tim Wescott ● February 11, 2009

On Tue, 10 Feb 2009 23:05:20 -0800, Jack wrote:

> On 10 Feb, 18:11, Tim Wescott <t...@seemywebsite.com> wrote:
> 
>> >> 200 * 5.25.
>>
>> > = 200 * 21 / 4
>>
>> > It's fastest if you can reduce your problem to integer operations.
>> > Please check if you _really_ need floating point operations.
>>
>> 200 * 21
>>
>> Right shift the answer by 2.
>>
>> (remember, this is assembly...)
> 
> An optimizing compiler should already convert /2^i to >>i. It is useless to
> make the code less readable if the compiler optimizes in the right manner. ;)
> 
> Bye Jack

Hard for it to do so when it isn't invoked.

As I said, this is assembly we're talking about.

(and yes, any halfway decent compiler will turn x / 4 into (x >> 2), 
assuming that it is used).
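
One detail worth keeping in mind when doing this by hand: the plain shift matches the division only for unsigned or non-negative values. A small C sketch follows, with hypothetical function names, assuming that `>>` on a negative signed value is an arithmetic shift (implementation-defined in C, but what common ARM compilers do).

```c
#include <stdint.h>

/* Unsigned: x / 4 and x >> 2 are identical for every value. */
uint32_t quarter_unsigned(uint32_t x)
{
    return x >> 2;
}

/* Signed: C division truncates toward zero, while an arithmetic shift
 * rounds toward minus infinity. Adding 3 to negative inputs before the
 * shift makes the two agree. Assumes >> on a negative value is an
 * arithmetic shift (implementation-defined). */
int32_t quarter_signed(int32_t x)
{
    int32_t bias = (x < 0) ? 3 : 0;
    return (x + bias) >> 2;
}
```

A compiler emits the equivalent of the bias step automatically for signed operands; in hand-written assembly it has to be added explicitly if negative values can occur.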

-- 
http://www.wescottdesign.com

Reply by CBFalconer ● February 11, 2009

Frank-Christian Krügel wrote:
> knightslancer schrieb:
>
... snip ...
>
>> I am trying to do arithmetic calculations that involve
>> multiplying a 32 bit integer by a floating point number.
>> For example:
>>
>>   200 * 5.25.
> 
> = 200 * 21 / 4
> 
> It's fastest if you can reduce your problem to integer operations.
> Please check if you _really_ need floating point operations.

However, ensure that the multiplication doesn't overflow, and don't
allow rearrangement.  I.e., in C, write: (200 * 21) / 4.  That is
after assuring the 200 * 21 won't overflow.  You may need to use
longs for the calculation.

The parentheses above ensure that the computation is not rearranged
into 200 / 4 * 21.  Check the assembly code.  That should work,
but the compiler may be faulty.  Then you will have to use:

    thing = 200;  /* done by earlier code */
    ...
    /* code to ensure no overflow */
    temp = thing * 21;
    ans  = temp / 4;
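
One way to make the "code to ensure no overflow" step unnecessary is to do the intermediate multiplication in a wider type. This is a minimal sketch, assuming a compiler with 64-bit integer support; the function name `scale_5_25` is hypothetical.

```c
#include <stdint.h>

/* Compute value * 21 / 4 (that is, value * 5.25 truncated toward zero).
 * The multiplication is done in 64 bits, so it cannot overflow for any
 * 32-bit input; the result is returned as 64 bits for the same reason. */
int64_t scale_5_25(int32_t value)
{
    int64_t temp = (int64_t)value * 21;  /* multiply first, in 64 bits */
    return temp / 4;                     /* then divide */
}
```

For an input of 200 this gives 1050. The order matters for inputs that are not multiples of 4: for 201, multiplying first gives 1055, whereas 201 / 4 * 21 would give 1050.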

-- 
 [mail]: Chuck F (cbfalconer at maineline dot net) 
 [page]: <http://cbfalconer.home.att.net>
            Try the download section.

Reply by CBFalconer ● February 11, 2009

Paul Keinanen wrote:
> "knightslancer" <knightslancer@gmail.com> wrote:
> 
... snip ...
>
>> How can I write assembly code for this on ARM7 LPC2292 boards?
>> Please help me out with this. I am a beginner in assembly language
>> programming.
> 
> If the processor does not have a floating point instruction set,
> just convert the integer to the same floating point notation as
> your floating point numbers (whatever notation you have chosen).
> Doing the actual floating point multiplication is just multiplying
> the significands and adding the exponents and correcting the bias.

In general, when using software floating point, you will find that
addition (or subtraction) is the slowest basic operation, due to
the need to find a common 'size' to inflict on both operands. 
Division is the next slowest, and multiplication the fastest.
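
The multiplication recipe quoted above (multiply the significands, add the exponents, correct the bias) can be sketched in C for the IEEE 754 single precision layout. This is a simplified illustration, not a complete implementation: it assumes both operands are normal, finite and non-zero, that the product is also normal, and it truncates instead of rounding. The function name `soft_fmul` is hypothetical.

```c
#include <stdint.h>

/* Multiply two IEEE 754 single precision bit patterns in software.
 * Sketch only: no zeros, denormals, infinities, NaNs, overflow or
 * underflow handling, and the result is truncated, not rounded. */
uint32_t soft_fmul(uint32_t a, uint32_t b)
{
    uint32_t sign = (a ^ b) & 0x80000000u;

    /* Add the biased exponents and remove one bias of 127. */
    int32_t exp = (int32_t)((a >> 23) & 0xFF)
                + (int32_t)((b >> 23) & 0xFF) - 127;

    /* Restore the hidden leading 1 of each 24-bit significand. */
    uint64_t ma = (a & 0x007FFFFFu) | 0x00800000u;
    uint64_t mb = (b & 0x007FFFFFu) | 0x00800000u;

    uint64_t prod = ma * mb;        /* 48-bit product in [2^46, 2^48) */

    if (prod & (1ULL << 47)) {      /* product in [2, 4): renormalise */
        prod >>= 1;
        exp  += 1;
    }

    /* Keep the top 24 bits and drop the hidden 1 again. */
    uint32_t mant = (uint32_t)(prod >> 23) & 0x007FFFFFu;

    return sign | ((uint32_t)exp << 23) | mant;
}
```

With 200.0 (0x43480000) and 5.25 (0x40A80000) as inputs, this produces 0x44834000, the bit pattern of 1050.0. Note that at most a single one-bit normalisation shift is needed, which is why multiplication avoids the long alignment shifts of addition.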

-- 
 [mail]: Chuck F (cbfalconer at maineline dot net) 
 [page]: <http://cbfalconer.home.att.net>
            Try the download section.

Reply by Paul Keinanen ● February 12, 2009

On Wed, 11 Feb 2009 21:47:30 -0500, CBFalconer <cbfalconer@yahoo.com>
wrote:

>Paul Keinanen wrote:
>> "knightslancer" <knightslancer@gmail.com> wrote:
>> 
>... snip ...
>>
>>> How can I write assembly code for this on ARM7 LPC2292 boards?
>>> Please help me out with this. I am a beginner in assembly language
>>> programming.
>> 
>> If the processor does not have a floating point instruction set,
>> just convert the integer to the same floating point notation as
>> your floating point numbers (whatever notation you have chosen).
>> Doing the actual floating point multiplication is just multiplying
>> the significands and adding the exponents and correcting the bias.
>
>In general, when using software floating point, you will find that
>addition (or subtraction) is the slowest basic operation, due to
>the need to find a common 'size' to inflict on both operands. 
>Division is the next slowest, and multiplication the fastest.

Floating point addition/subtractions can be quite slow due to the need
to denormalize the smaller operand by shifting it right by up to 24
bits for a 32 bit single precision float and normalizing the
sum/difference by shifting it left by up to 24 bits. 

On an 8 bitter, an initial bulk shift by 8 or 16 bits can be done
with byte moves, and the remaining 1-7 bit shift can then be done the
traditional way.

Floating multiply can be faster, if the processor has a decent
8x8=>16, 16x16=>32 or 32x32=>64 bit single cycle unsigned integer
multiply instruction. With only an 8x8=>16 (and 16x16=>16) bit HW
multiplication instruction, nine such multiplications are needed for
the single precision case. Even with 8x8 multiply instructions, it
might still be more efficient to do the 24x24=>24 bit mantissa
multiplication the traditional way by shifts and adds.
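
The count of nine multiplications comes from splitting each 24-bit mantissa into three bytes and forming every byte-by-byte partial product. A minimal C sketch of that decomposition follows; `mul8x8` stands in for the hardware 8x8=>16 instruction, and both function names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for a hardware 8x8=>16 bit unsigned multiply instruction. */
uint16_t mul8x8(uint8_t a, uint8_t b)
{
    return (uint16_t)(a * b);
}

/* Full 24x24=>48 bit product built from nine 8x8=>16 partial products.
 * Only the low 24 bits of each argument are used. */
uint64_t mul24x24(uint32_t a, uint32_t b)
{
    uint64_t result = 0;

    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            uint8_t ai = (uint8_t)(a >> (8 * i));   /* byte i of a */
            uint8_t bj = (uint8_t)(b >> (8 * j));   /* byte j of b */

            /* Each partial product is weighted by 2^(8*(i+j)). */
            result += (uint64_t)mul8x8(ai, bj) << (8 * (i + j));
        }
    }
    return result;
}
```

On a real 8 bitter the 64-bit accumulator would be replaced by multi-byte add-with-carry sequences, which is where much of the cost goes.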

Paul

Reply by Grant Edwards ● February 12, 2009

On 2009-02-12, CBFalconer <cbfalconer@yahoo.com> wrote:

> In general, when using software floating point, you will find that
> addition (or subtraction) is the slowest basic operation, due to
> the need to find a common 'size' to inflict on both operands. 
> Division is the next slowest, and multiplication the fastest.

I've not found that to be true on any of the platforms I've
benchmarked.  For example, I timed the four operations on a
6800, and add/sub was about 1ms, and mult/div was about 4ms.

-- 
Grant Edwards                   grante             Yow! Please come home with
                                  at               me ... I have Tylenol!!
                               visi.com

Reply by Vladimir Vassilevsky ● February 12, 2009

Grant Edwards wrote:

> On 2009-02-12, CBFalconer <cbfalconer@yahoo.com> wrote:
> 
> 
>>In general, when using software floating point, you will find that
>>addition (or subtraction) is the slowest basic operation, due to
>>the need to find a common 'size' to inflict on both operands. 
>>Division is the next slowest, and multiplication the fastest.
> 
> 
> I've not found that to be true on any of the platforms I've
> benchmarked.  For example, I timed the four operations on a
> 6800, and add/sub was about 1ms, and mult/div was about 4ms.

I compared fixed point math to emulated floating point on AVR, 
HC12, TMS28xx and BlackFin. For the same control algorithms implemented 
in C/C++, the floating point variant can be expected to be roughly 15 
times slower than the integer one. The float add/sub/mul speeds are in 
the same ballpark; however, division is much slower, at roughly 4 to 10 
times the cost of the other operations.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com

Reply by Not Really Me ● February 12, 2009

knightslancer wrote:
> Hi Friends,
>
> I have tried a lot about this. I donno if my mind is not good at this.
>
> I am trying to do arithmetic calculations that involve multiplying
> a 32 bit integer by a floating point number. For example:
>
> 200 * 5.25.
>
> How can I write assembly code for this on ARM7 LPC2292 boards?
> Please help me out with this. I am a beginner in assembly language
> programming.
>
> Thanks
> knight

You seem to have spawned a bunch of conversations, most of which don't seem 
to help.

We have used conversion to scaled integers to accomplish this on low end 
micros.  This has limitations in that ideally you need to know the number of 
decimal places that you have, or at least the number that are significant. 
Using your example numbers, and a limit of 2 decimals, you multiply the float 
by 100 and convert to an integer. 5.25 now becomes 525.

Do the multiplication as shown and you have the answer scaled up by 100, 
(200 * 525 = 105,000).

Simply divide it down by the applicable factors and subtract the intermediate 
values to get the real answer.

105,000 / 100 = 1050 (units)
105,000 - 105,000 = 0, so you are done.

123 * 1.23 = ?
123 * (1.23 * 100) = ?
123 * 123 = 15129
15129 / 100 = 151 (units)
15129 - (151*100) = ?
15129 - 15100 = 29 (non zero, so keep going)
29 / 10 = 2 (tenths)
29 - (2*10) = ?
29 - 20 = 9 (non zero, so keep going)
Next division is by 1, so you are done, the 9 is now hundredths.  Now 
reassemble the pieces.

151 units + 2 tenths + 9 hundredths = 151.29
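
The worked example above can be sketched in C as follows. This is a minimal illustration, assuming two decimal places and a factor that has already been scaled by 100 at build time or by hand; the names `SCALE` and `scaled_multiply` are hypothetical. For brevity it returns the tenths and hundredths together as one remainder instead of peeling them off digit by digit.

```c
#include <stdint.h>

#define SCALE 100   /* two decimal places */

/* Multiply an integer by a factor that was pre-scaled by SCALE
 * (1.23 is passed as 123), then split the result into whole units
 * and hundredths. The product is formed in 64 bits to avoid overflow. */
void scaled_multiply(int32_t value, int32_t factor_x100,
                     int32_t *units, int32_t *hundredths)
{
    int64_t product = (int64_t)value * factor_x100;

    *units      = (int32_t)(product / SCALE);  /* 15129 / 100 = 151 */
    *hundredths = (int32_t)(product % SCALE);  /* 15129 % 100 = 29  */
}
```

Calling it with 123 and 123 yields 151 units and 29 hundredths (151.29); calling it with 200 and 525 yields 1050 units and 0 hundredths.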

Scott

Reply by Walter Banks ● February 12, 2009

> I compared fixed point math to emulated floating point on AVR,
> HC12, TMS28xx and BlackFin. For the same control algorithms implemented
> in C/C++, the floating point variant can be expected to be roughly 15
> times slower than the integer one. The float add/sub/mul speeds are in
> the same ballpark; however, division is much slower, at roughly 4 to 10
> times the cost of the other operations.

We implemented a fixed point library a couple of years ago and were surprised
to find that, for transcendental functions at the same data size (4 byte
float and 8:24 fixed, for example), the execution time was remarkably similar.
From an application point of view, fixed point gave increased precision and
floating point gave larger dynamic range.  As several people have pointed out,
the biggest time issue is normalization on processors that don't have a
barrel shifter.
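
For readers unfamiliar with the 8:24 notation, here is a minimal C sketch of such a type and its multiply, assuming 8 integer bits (including sign) and 24 fraction bits in a 32-bit word. The names `q8_24`, `Q8_24_ONE` and `q8_24_mul` are hypothetical and not taken from any particular library. It also assumes that `>>` on a negative 64-bit value is an arithmetic shift, which is implementation-defined in C.

```c
#include <stdint.h>

typedef int32_t q8_24;                 /* 8 integer bits, 24 fraction bits */
#define Q8_24_ONE ((q8_24)1 << 24)     /* the value 1.0 */

/* Multiply two 8:24 values. The 64-bit product has 48 fraction bits,
 * so shifting right by 24 restores the 8:24 scaling. No normalization
 * step is needed, unlike floating point. The caller must ensure the
 * result fits in the range of roughly -128 to +128. */
q8_24 q8_24_mul(q8_24 a, q8_24 b)
{
    return (q8_24)(((int64_t)a * b) >> 24);
}
```

All 32 bits carry value here, versus a 24-bit significand in a 4 byte float, which is the precision advantage mentioned above; the price is the narrow range.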

Regards,

--
Walter Banks
Byte Craft Limited
http://www.bytecraft.com