Cortex M4 Floating Point Size | page 3

Reply by ●August 3, 2013

On Fri, 2 Aug 2013 15:54:47 -0700 (PDT), dp <dp@tgi-sci.com> wrote:

>On Wednesday, July 31, 2013 8:59:39 PM UTC+3, Jim Stewart wrote:
>> ....
>> 
>> Just out of idle curiosity, what kind of an
>> application might require 64 bit floating point?
>
>Oh more than those which can use 32 bits for sure.
>For example, if you will be DSP-ing (that is, doing lots of MAC),
>32-bit FP is just useless; the 24-bit mantissa begins
>to lose data before you know it.

Are FP DSP processors really doing hardware MAC instructions internally
using a 32-bit FP representation? I very much doubt that.

When doing MACs in software using integer instructions, why would one
use FP for the MAC processing? Use a big 32/64-bit integer/fixed-point
accumulator and convert only the final result to floating point for
further processing.
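
To make the accumulator idea concrete, here is a minimal sketch in C. The Q15 sample/coefficient format and the function names mac_q15 and q30_to_double are my own assumptions for illustration, not anything from a particular DSP library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed format: Q15 samples and coefficients (value = raw / 32768).
 * Each product is Q30 and fits in 32 bits; the running sum is kept
 * in a 64-bit integer, so no bits are lost while accumulating. */
int64_t mac_q15(const int16_t *x, const int16_t *h, size_t n)
{
    assert(x != NULL && h != NULL);
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int32_t)x[i] * (int32_t)h[i];
    return acc;
}

/* Convert the Q30 accumulator to floating point once, at the end. */
double q30_to_double(int64_t acc)
{
    return (double)acc / 1073741824.0;  /* 2^30 */
}
```

The only rounding happens in the final conversion; everything inside the loop is exact integer arithmetic.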

FP add/sub are nasty, since they may require normalization of the
result: first the number of bits to shift must be determined, and
then the mantissa must be shifted that many bits to the left.
Without some hardware support (a find-first-bit-set style
instruction), this is quite time consuming and causes variable latency.
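
A rough sketch of that normalization step in portable C, assuming a 32-bit mantissa that is normalized when bit 31 is set. The helper names clz32 and normalize are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Portable count-leading-zeros for a nonzero 32-bit value.
 * The loop count depends on the data, which is where the variable
 * latency comes from; a hardware CLZ/find-first-set instruction
 * would replace this whole loop. */
static int clz32(uint32_t m)
{
    assert(m != 0);
    int n = 0;
    while ((m & 0x80000000u) == 0) {
        m <<= 1;
        n++;
    }
    return n;
}

/* Shift the mantissa left until bit 31 is set and lower the
 * exponent by the same amount. Zero is left untouched. */
void normalize(uint32_t *mant, int *exp)
{
    if (*mant == 0)
        return;
    int shift = clz32(*mant);
    *mant <<= shift;
    *exp -= shift;
}
```

A software FP add or subtract needs something like this after every operation that can cancel leading bits.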

Thus, when evaluating higher-degree polynomials, the
intermediate results should be kept in integer/fixed-point format, and
only the final result rounded/truncated to the required representation.
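
For the polynomial case, a sketch using Horner's rule with the running value held in a 64-bit integer. The formats (x in Q15, coefficients in Q30) and the name horner_q30 are assumptions chosen for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Evaluate c[0] + c[1]*x + ... + c[n-1]*x^(n-1) by Horner's rule.
 * x is Q15; coefficients and the running value are Q30 in an int64_t.
 * Division (not >>) rescales after each multiply so that negative
 * values behave the same on every compiler. */
int64_t horner_q30(const int32_t *c, size_t n, int16_t x_q15)
{
    assert(n > 0);
    int64_t acc = c[n - 1];
    for (size_t i = n - 1; i-- > 0; )
        acc = acc * x_q15 / 32768 + c[i];
    return acc;  /* Q30; the caller rounds to the final format */
}
```

The result stays in Q30 until the caller converts it, so the conversion to the required representation happens exactly once.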

Reply by dp ●August 3, 2013

On Saturday, August 3, 2013 8:59:40 AM UTC+3, upsid...@downunder.com wrote:
> On Fri, 2 Aug 2013 15:54:47 -0700 (PDT), dp <dp@tgi-sci.com> wrote:
> 
> >On Wednesday, July 31, 2013 8:59:39 PM UTC+3, Jim Stewart wrote:
> >> ....
> >> 
> >> Just out of idle curiosity, what kind of an
> >> application might require 64 bit floating point?
> >
> >Oh more than those which can use 32 bits for sure.
> >For example, if you will be DSP-ing (that is, doing lots of MAC),
> >32-bit FP is just useless; the 24-bit mantissa begins
> >to lose data before you know it.
>
> Are FP DSP processors really doing hardware MAC instructions internally
> using a 32-bit FP representation? I very much doubt that.
> 

Don't know about specialized FP DSPs, never used one. I have been
doing a lot of DSP-ing on a Power (PPC) FPU (mostly on an MPC5200B).
It has 32 64-bit FPU registers and can do MAC at both 32- and 64-bit
precision. It takes 1 cycle per 32-bit MAC and 2 cycles per 64-bit MAC.
Reaching that is not as straightforward as on a DSP though; there
are data dependencies to take into account. OTOH, having 32 registers
can save a lot of loads and stores during the filter loop. I managed
the 2 cycles/MAC in a loop at about 10% load/store etc. overhead
penalty. Here is how I did it (VPA macros, self-explanatory
enough though):

http://tgi-sci.com/misc/mac8.sa 

Without going through that, instead of 5 ns/MAC I was getting 30 ns/MAC
in a plain loop, which is to be expected really, as the pipeline is
6 stages IIRC.
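
For readers who do not want to wade through the assembly, the general idea of working around the data dependencies can be sketched in C. This is not a translation of the linked file; the unroll factor of 4 and the name dot4 are arbitrary choices for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Dot product with four independent accumulators. Consecutive
 * multiply-adds do not wait on each other's results, so a pipelined
 * FPU can start a new one before the previous one has finished. */
double dot4(const double *x, const double *h, size_t n)
{
    assert(x != NULL && h != NULL);
    double a0 = 0.0, a1 = 0.0, a2 = 0.0, a3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += x[i]     * h[i];
        a1 += x[i + 1] * h[i + 1];
        a2 += x[i + 2] * h[i + 2];
        a3 += x[i + 3] * h[i + 3];
    }
    for (; i < n; i++)          /* leftover taps */
        a0 += x[i] * h[i];
    return (a0 + a1) + (a2 + a3);
}
```

Whether a compiler actually keeps the four sums in registers and schedules them well depends on the target and flags, which is one reason to do this by hand in assembly.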

> When doing MACs in software using integer instructions, why would one
> use FP for the MAC processing? Use a big 32/64-bit integer/fixed-point
> accumulator and convert only the final result to floating point for
> further processing.

Well of course, the thing with "normal" 32-bit processors is that
they do not have 64-bit accumulators, and 32 bits is nowhere near
sufficient. 64-bit FP, OTOH, is quite handy, especially on the
Power Architecture FPU, where one can read 32-bit FP data
and have it expanded to 64 bits in a single cycle.
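
A small C sketch of the difference, assuming float is IEEE 754 single and double is IEEE 754 double. The names mac_f32 and mac_f64 are made up for the example:

```c
#include <assert.h>
#include <stddef.h>

/* 32-bit data, 32-bit accumulator: small terms vanish once the
 * sum no longer fits in the 24-bit significand. */
float mac_f32(const float *x, const float *h, size_t n)
{
    assert(x != NULL && h != NULL);
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i] * h[i];
    return acc;
}

/* Same 32-bit data, widened to 64 bits before the multiply-add. */
double mac_f64(const float *x, const float *h, size_t n)
{
    assert(x != NULL && h != NULL);
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += (double)x[i] * (double)h[i];
    return acc;
}
```

With x = {16777216.0f, 1.0f, 1.0f} and all coefficients 1.0f, the double accumulator returns 16777218, while a target that evaluates float in single precision with round-to-nearest leaves the float accumulator stuck at 16777216, since 2^24 + 1 is not representable in single precision.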

> FP add/sub are nasty, since they may require normalization of the
> result: first the number of bits to shift must be determined, and
> then the mantissa must be shifted that many bits to the left.

Last time I had the fun of doing this was on a CPU32 (on the 68340),
quite a while ago :-). But the hardware FPUs on the Power
Architecture processors are really good at this; somehow they
manage add/sub/mul within a single cycle.

Dimiter

------------------------------------------------------
Dimiter Popoff               Transgalactic Instruments

http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/

Reply by Jon Kirwan ●August 4, 2013

On Wed, 31 Jul 2013 14:43:58 -0500, Tim Wescott
<tim@seemywebsite.really> wrote:

><snip>
>Most control loops that need any precision won't work quite right with 32 
>bit floating point.  You need more than the 25 bits worth of mantissa 
>that comes with single-precision floating point
><snip>

I know you don't really need the details but:

Most 32-bit FPUs use 8 bits for the (biased) exponent, one bit
for the sign, and this leaves only 23 stored bits for the mantissa.
Not 25. (There is also the hidden bit, of course, which gives 24
bits of precision.)
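
If anyone wants to check the layout on their own target, here is a short C sketch that pulls the three fields apart. It assumes float is a 32-bit IEEE 754 single; the struct and function names are mine:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127 */
    uint32_t fraction;  /* 23 stored bits; the hidden bit is not stored */
} f32_fields;

f32_fields f32_split(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);  /* assumption: 32-bit float */
    memcpy(&bits, &f, sizeof bits);   /* well-defined way to read the bits */

    f32_fields r;
    r.sign     = bits >> 31;
    r.exponent = (bits >> 23) & 0xFFu;
    r.fraction = bits & 0x7FFFFFu;
    return r;
}
```

For 1.0f this gives sign 0, exponent 127 and fraction 0; for -1.5f it gives sign 1, exponent 127 and fraction 0x400000.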

Just being pedantic.

Jon
