floating point calculations.

Thread started February 10, 2009
```On 10 Feb, 18:11, Tim Wescott <t...@seemywebsite.com> wrote:

> >> 200 * 5.25.
>
> > = 200 * 21 / 4
>
> > It's fastest if you can reduce your problem to integer operations.
> > Please check if you _really_ need floating point operations.
>
> 200 * 21
>
> Right shift the answer by 2.
>
> (remember, this is assembly...)

An optimizing compiler should already convert /2^i to >>i.
It's pointless to make the code less readable if the compiler optimizes in the
right manner. ;)

Bye Jack
```
```Jack wrote:
> On 10 Feb, 18:11, Tim Wescott <t...@seemywebsite.com> wrote:
>
>>>> 200 * 5.25.
>>> = 200 * 21 / 4
>>> It's fastest if you can reduce your problem to integer operations.
>>> Please check if you _really_ need floating point operations.
>> 200 * 21
>>
>> Right shift the answer by 2.
>>
>> (remember, this is assembly...)
>
> An optimizing compiler should already convert /2^i to >>i.
> It's pointless to make the code less readable if the compiler optimizes in the
> right manner. ;)
>

That would be true if he were using a compiler, but he wants to use
assembly (for homework).
```
```Vladimir Vassilevsky wrote:
>
>
> Paul Keinanen wrote:
>
>
>> But as others have said, the best thing in most embedded systems is
>> getting rid of the floating point calculations entirely.
>
> I have to disagree here. Although the floating point math is typically
> some 15 times slower than the native math of an 8/16 bitter, it greatly
> simplifies the development and makes the code much more readable and
> portable. As for the code speed and size, it matters only in the few
> cases when it matters.
>

It is not just a matter of 15 times slower - software floating point can
be a great deal slower than that compared to well-designed integer
algorithms, especially on smaller micros.  I still agree with your
principle, however - there is no point in forcing an inherently floating
point algorithm into integer maths if the size and speed of the code is
not important.  Correct code is more important than fast code!
```
```"David Brown" <david@westcontrol.removethisbit.com> wrote in message
news:49927f5b\$0\$14787\$8404b019@news.wineasy.se...
> Jack wrote:
>> On 10 Feb, 18:11, Tim Wescott <t...@seemywebsite.com> wrote:
>>
>>>>> 200 * 5.25.
>>>> = 200 * 21 / 4
>>>> It's fastest if you can reduce your problem to integer operations.
>>>> Please check if you _really_ need floating point operations.
>>> 200 * 21
>>>
>>> Right shift the answer by 2.
>>>
>>> (remember, this is assembly...)
>>
>> An optimizing compiler should already convert /2^i to >>i.
>> It's pointless to make the code less readable if the compiler optimizes in the
>> right manner. ;)

Not true.

> That would be true if he were using a compiler, but he wants to use
> assembly (for homework).

Still not true.

Peter

```
```On Wed, 11 Feb 2009 08:33:54 +0100, David Brown
<david@westcontrol.removethisbit.com> wrote:

>It is not just a matter of 15 times slower - software floating point can
>be a great deal slower than that compared to well-designed integer
>algorithms, especially on smaller micros.

The code becomes slow and bulky on a small micro if you demand full
IEEE-754 compliance :-).

In small systems it can be more practical to use some other format
more suitable for the available simple instruction set. One example
was the 6 byte Turbo-Pascal real data type format with 8 bit exponent
and 40 bit mantissa.

For really small systems a 3 byte format with 8 bit exponent and 16
bit mantissa is often enough and easy to implement. Such a format gives
about 4-5 significant digits, which is often enough when the
application interfaces with the external world through 12-16 bit A/D
and D/A converters.

Paul

```
```Peter Dickerson wrote:
> "David Brown" <david@westcontrol.removethisbit.com> wrote in message
> news:49927f5b\$0\$14787\$8404b019@news.wineasy.se...
>> Jack wrote:
>>> On 10 Feb, 18:11, Tim Wescott <t...@seemywebsite.com> wrote:
>>>
>>>>>> 200 * 5.25.
>>>>> = 200 * 21 / 4
>>>>> It's fastest if you can reduce your problem to integer operations.
>>>>> Please check if you _really_ need floating point operations.
>>>> 200 * 21
>>>>
>>>> Right shift the answer by 2.
>>>>
>>>> (remember, this is assembly...)
>>> An optimizing compiler should already convert /2^i to >>i.
>>> It's pointless to make the code less readable if the compiler optimizes in the
>>> right manner. ;)
>
> Not true.
>
>> That would be true if he were using a compiler, but he wants to use
>> assembly (for homework).
>
> Still not true.
>

Do you mean that it is not the case that the compiler will optimise a
/2^i to a >>i instruction (or instruction sequence)?

*Roughly* speaking, any decent compiler *will* do that optimisation.  If
you want to be pedantic, then the appropriate strength reduction
transformation for the /2^i is dependent on a number of factors,
including the signedness of the operands (signed division involves a
little more in addition to the shift), the cpu in question (maybe it has
a fast divider), other parts of the code (maybe the result can be
pre-calculated, or calculations can be combined or omitted), and so on.
```
```Paul Keinanen wrote:
> On Wed, 11 Feb 2009 08:33:54 +0100, David Brown
> <david@westcontrol.removethisbit.com> wrote:
>
>> It is not just a matter of 15 times slower - software floating point can
>> be a great deal slower than that compared to well-designed integer
>> algorithms, especially on smaller micros.
>
> The code becomes slow and bulky on a small micro if you demand full
> IEEE-754 compliance :-).
>
> In small systems it can be more practical to use some other format
> more suitable for the available simple instruction set. One example
> was the 6 byte Turbo-Pascal real data type format with 8 bit exponent
> and 40 bit mantissa.
>
> For really small systems a 3 byte format with 8 bit exponent and 16
> bit mantissa is often enough and easy to implement. Such a format gives
> about 4-5 significant digits, which is often enough when the
> application interfaces with the external world through 12-16 bit A/D
> and D/A converters.
>

Yes, such extra formats can be very efficient.  However, you'll probably
lose all benefits of having clear and understandable source code (which
is often the reason to choose floating point in the first place), unless
your compiler supports such formats directly.

Avoiding full IEEE-754 is almost always a good idea - embedded systems
(and non-embedded systems) seldom have use for NaNs, etc., in real programs.
```
```"David Brown" <david@westcontrol.removethisbit.com> wrote in message
news:49929f44\$0\$14894\$8404b019@news.wineasy.se...
> Peter Dickerson wrote:
>> "David Brown" <david@westcontrol.removethisbit.com> wrote in message
>> news:49927f5b\$0\$14787\$8404b019@news.wineasy.se...
>>> Jack wrote:
>>>> On 10 Feb, 18:11, Tim Wescott <t...@seemywebsite.com> wrote:
>>>>
>>>>>>> 200 * 5.25.
>>>>>> = 200 * 21 / 4
>>>>>> It's fastest if you can reduce your problem to integer operations.
>>>>>> Please check if you _really_ need floating point operations.
>>>>> 200 * 21
>>>>>
>>>>> Right shift the answer by 2.
>>>>>
>>>>> (remember, this is assembly...)
>>>> An optimizing compiler should already convert /2^i to >>i.
>>>> It's pointless to make the code less readable if the compiler optimizes in the
>>>> right manner. ;)
>>
>> Not true.
>>
>>> That would be true if he were using a compiler, but he wants to use
>>> assembly (for homework).
>>
>> Still not true.
>>
>
> Do you mean that it is not the case that the compiler will optimise a /2^i
> to a >>i instruction (or instruction sequence)?

> *Roughly* speaking, any decent compiler *will* do that optimisation.  If
> you want to be pedantic, then the appropriate strength reduction
> transformation for the /2^i is dependent on a number of factors, including
> the signedness of the operands (signed division involves a little more in
> addition to the shift), the cpu in question (maybe it has a fast divider),
> other parts of the code (maybe the result can be pre-calculated, or
> calculations can be combined or omitted), and so on.

Yes, I want to be pedantic. A shift is not sufficient for signed integers
unless the compiler knows it will be shifting a positive integer. As you
say, there is an extra offset to add for negative integers, plus a test if
you don't know which.

Peter

```
```knightslancer wrote:
> Hey guys, thanks for your posts. But I had to represent the data in Q24.8
> format and the integer in 32-bit format, using DCD directives only in the data
> section, like:
>

Are you sure they mean floating point, or do they mean fixed point, with
24 bits for the integer part and 8 bits for the fractional part (so the
smallest step is (1/2)^8)? In the fixed-point case the problem is very
easy: left shift your 32-bit integer by 8 bits, multiply it by your
32-bit fixed-point value as is, and right shift the 64-bit result by 8 bits.

5.25 would be as a fixed point (24.8) something like

00000000 00000000 00000101 . 01000000  (bit 7 = 1/2, bit 6 = 1/4 etc etc
downto bit 0)

and 200 would be

00000000 00000000 00000000 11001000 === left shift by 8 bits ==>
00000000 00000000 11001000 . 00000000

So, at the end you multiply the two 32-bit numbers:
00000000 00000000 00000101 01000000
x	00000000 00000000 11001000 00000000
-------------------------------------------
....0	00000100 00011010 00000000 00000000

thus, right shifting by 8 bits gives

00000000 00000100 00011010 00000000

or as fixed point (24.8)

00000000 00000100 00011010 . 00000000

of which the integer part is 1050 (the upper 24 bits) and the
fractional part is 0 (the lower 8 bits).

Note that this is for unsigned numbers, with your integer limited to
24 bits (because 8 bits hold the fraction).

Best regards
GM

```
```Peter Dickerson wrote:
> "David Brown" <david@westcontrol.removethisbit.com> wrote in message
> news:49929f44\$0\$14894\$8404b019@news.wineasy.se...
>> Peter Dickerson wrote:
>>> "David Brown" <david@westcontrol.removethisbit.com> wrote in message
>>> news:49927f5b\$0\$14787\$8404b019@news.wineasy.se...
>>>> Jack wrote:
>>>>> On 10 Feb, 18:11, Tim Wescott <t...@seemywebsite.com> wrote:
>>>>>
>>>>>>>> 200 * 5.25.
>>>>>>> = 200 * 21 / 4
>>>>>>> It's fastest if you can reduce your problem to integer operations.
>>>>>>> Please check if you _really_ need floating point operations.
>>>>>> 200 * 21
>>>>>>
>>>>>> Right shift the answer by 2.
>>>>>>
>>>>>> (remember, this is assembly...)
>>>>> An optimizing compiler should already convert /2^i to >>i.
>>>>> It's pointless to make the code less readable if the compiler optimizes in the
>>>>> right manner. ;)
>>> Not true.
>>>
>>>> That would be true if he were using a compiler, but he wants to use
>>>> assembly (for homework).
>>> Still not true.
>>>
>> Do you mean that it is not the case that the compiler will optimise a /2^i
>> to a >>i instruction (or instruction sequence)?
>
>> *Roughly* speaking, any decent compiler *will* do that optimisation.  If
>> you want to be pedantic, then the appropriate strength reduction
>> transformation for the /2^i is dependent on a number of factors, including
>> the signedness of the operands (signed division involves a little more in
>> addition to the shift), the cpu in question (maybe it has a fast divider),
>> other parts of the code (maybe the result can be pre-calculated, or
>> calculations can be combined or omitted), and so on.
>
> Yes, I want to be pedantic. A shift is not sufficient for signed integers
> unless the compiler knows it will be shifting a positive integer. As you
> say, there is an extra offset to add for negative integers, plus a test if
> you don't know which.
>

That's why it's best to let the compiler do the optimisation - it will
get it right!

But if you are writing critical code, it's good to know *how* the
compiler will optimise the code, so that you can help it (for example by
choosing signed or unsigned data appropriately).
```