EmbeddedRelated.com

Cortex M4 Floating Point Size

Started by Tim Wescott July 30, 2013
On Fri, 2 Aug 2013 15:54:47 -0700 (PDT), dp <dp@tgi-sci.com> wrote:

>On Wednesday, July 31, 2013 8:59:39 PM UTC+3, Jim Stewart wrote:
>> ....
>>
>> Just out of idle curiosity, what kind of an
>> application might require 64 bit floating point?
>
>Oh more than those which can use 32 bits for sure.
>For example, if you will be DSP-ing (that is, doing lots of MAC),
>32-bit FP is just useless, the 24 bit mantissa begins
>to lose data before you know.
Are FP DSP processors really doing hardware MAC instructions internally using a 32-bit FP representation? I very much doubt that.

When doing MACs in software using integer instructions, why would one use FP for MAC processing? Use some big 32/64-bit integer/fixed-point accumulator and only convert the final result to floating point for further processing.

FP add/sub are nasty, since these may require normalization of the result: first one must determine how many bits need to be shifted, and then shift the mantissa that many bits to the left. Without some hardware support (a find-first-bit-set style instruction), this is quite time consuming and causes variable latency.

Thus, when doing higher-degree polynomial calculations, the intermediate results should be kept in integer/fixed-point format, and only the final result should be rounded/truncated to the required representation.
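The wide-accumulator approach suggested in the post above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the Q15 sample format, the 64-bit accumulator width, and the function names `mac_q15` and `q30_to_float` are hypothetical choices, not something taken from the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply-accumulate over Q15 samples (assumed format: signed 16-bit
 * values scaled by 2^15). Each product is an exact Q30 value that fits
 * in 32 bits, and the 64-bit accumulator adds them without rounding. */
int64_t mac_q15(const int16_t *x, const int16_t *h, size_t n)
{
    int64_t acc = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        acc += (int32_t)x[i] * (int32_t)h[i];
    }
    return acc;
}

/* Convert the Q30 accumulator to floating point once, at the very end.
 * For |acc| < 2^53 the conversion to double and the division by 2^30
 * are exact, so the only rounding is the final cast to float. */
float q30_to_float(int64_t acc)
{
    return (float)((double)acc / 1073741824.0);
}
```

For instance, two taps with both sample and coefficient equal to 16384 (0.5 in Q15) accumulate to 2^29 in Q30, which converts to 0.5. On a 32-bit core the 64-bit accumulate costs extra instructions per tap unless the hardware offers a long multiply-accumulate, so whether this beats hardware FP depends on the target.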
On Saturday, August 3, 2013 8:59:40 AM UTC+3, upsid...@downunder.com wrote:
> On Fri, 2 Aug 2013 15:54:47 -0700 (PDT), dp <dp@tgi-sci.com> wrote:
>
> >On Wednesday, July 31, 2013 8:59:39 PM UTC+3, Jim Stewart wrote:
> >> ....
> >>
> >> Just out of idle curiosity, what kind of an
> >> application might require 64 bit floating point?
> >
> >Oh more than those which can use 32 bits for sure.
> >For example, if you will be DSP-ing (that is, doing lots of MAC),
> >32-bit FP is just useless, the 24 bit mantissa begins
> >to lose data before you know.
>
> Are FP-DSP processor really doing hardware MAC instructions internally
> using 32 bit FP representation ? I very much doubt that.
Don't know about specialized FP DSPs, never used one. I have been doing a lot of DSP-ing on a Power (PPC) FPU (mostly on an MPC5200B). It has 32 64-bit FPU registers and can do MAC at both 32- and 64-bit precision. It takes 1 cycle per 32-bit MAC and 2 cycles per 64-bit MAC.

Reaching that is not as straightforward as on a DSP though; there are data dependencies to take into account. OTOH, having 32 registers can save a lot of loads and stores during the filter loop. I managed the 2 cycles/MAC in a loop at about 10% load/store etc. overhead penalty. Here is how I did it (VPA macros, self-explanatory enough though): http://tgi-sci.com/misc/mac8.sa

Without going through that, I was getting 30 ns/MAC in a plain loop instead of 5 ns/MAC, which is to be expected really, as the pipeline is 6 stages IIRC.
> When doing MACs in software using integer instruction, why would one
> use FP for MAC processing ? Use some big 32/64 bit integer/fixed point
> accumulator and only convert the final result to floating point for
> further processing.
Well, of course; the thing with "normal" 32-bit processors is that they do not have 64-bit accumulators, and 32 bits is nowhere near sufficient. 64-bit FP, OTOH, is quite handy, especially on the Power architecture FPU, where one can read 32-bit FP data and have it expanded to 64 bits in a single cycle.
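To make the precision argument concrete, here is a small C sketch comparing a single-precision accumulator with a double-precision one fed from the same 32-bit data. The function names `mac_f32` and `mac_f64` are made up for this illustration, and the sketch assumes IEEE 754 arithmetic with round-to-nearest and no excess intermediate precision.

```c
#include <stddef.h>

/* Accumulate in single precision: every add rounds the running sum
 * to a 24-bit significand. */
float mac_f32(const float *x, const float *h, size_t n)
{
    float acc = 0.0f;
    size_t i;

    for (i = 0; i < n; i++) {
        acc += x[i] * h[i];
    }
    return acc;
}

/* Same 32-bit input data, widened to double before the multiply-add.
 * The product of two floats is exact in double (48 bits needed, 53
 * available), so only the additions can round. */
double mac_f64(const float *x, const float *h, size_t n)
{
    double acc = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
        acc += (double)x[i] * (double)h[i];
    }
    return acc;
}
```

With inputs `{16777216.0f, 1.0f, 1.0f, 1.0f, 1.0f}` and all coefficients equal to 1.0f, the single-precision sum stays stuck at 16777216 (2^24), because each further +1 falls below the accumulator's resolution, while the double-precision sum reaches 16777220.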
> FP add/sub are nasty, since these may require normalization of the
> result, in which first must be determined how many bits needs to be
> shifted and then shift the mantissa that amount of bits to the left.
Last time I had the fun of doing this was on a CPU32 (on the 68340), quite a while ago :-). But the hardware FPUs on the Power architecture processors are really good at this; somehow they manage add/sub/mul within a single cycle.

Dimiter

------------------------------------------------------
Dimiter Popoff Transgalactic Instruments
http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/
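For readers who have not written a software FP add, the normalization step discussed in this exchange looks roughly like the C sketch below. The names `clz32` and `normalize32` are hypothetical, and the portable bit-counting routine stands in for the find-first-set hardware support mentioned in the quoted text (ARMv7-M cores such as the Cortex-M4 provide a CLZ instruction for this).

```c
#include <stdint.h>

/* Portable count-leading-zeros by binary search; returns 32 for v == 0.
 * On a core with a CLZ-style instruction this is a single operation. */
unsigned clz32(uint32_t v)
{
    unsigned n = 0;

    if (v == 0) {
        return 32;
    }
    if ((v & 0xFFFF0000u) == 0) { n += 16; v <<= 16; }
    if ((v & 0xFF000000u) == 0) { n += 8;  v <<= 8;  }
    if ((v & 0xF0000000u) == 0) { n += 4;  v <<= 4;  }
    if ((v & 0xC0000000u) == 0) { n += 2;  v <<= 2;  }
    if ((v & 0x80000000u) == 0) { n += 1; }
    return n;
}

/* Shift a nonzero mantissa left until bit 31 is set and lower the
 * exponent by the same amount. A zero mantissa is returned unchanged. */
uint32_t normalize32(uint32_t mant, int *exp)
{
    unsigned shift;

    if (mant == 0) {
        return 0;
    }
    shift = clz32(mant);
    *exp -= (int)shift;
    return mant << shift;
}
```

Without the count-leading-zeros shortcut, the usual fallback is a shift-and-test loop whose run time depends on the data, which is the variable latency the quoted post complains about.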
On Wed, 31 Jul 2013 14:43:58 -0500, Tim Wescott
<tim@seemywebsite.really> wrote:

><snip>
>Most control loops that need any precision won't work quite right with 32
>bit floating point. You need more than the 25 bits worth of mantissa
>that comes with single-precision floating point
><snip>
I know you don't really need the details but: Most 32-bit FPUs use 8 bits for the signed exponent, one bit for the sign, and this leaves only 23 bits for the mantissa. Not 25. (There is also the hidden bit, of course.) Just being pedantic.

Jon
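The field layout described above can be checked with a few lines of C. This sketch assumes that `float` is an IEEE 754 binary32 value, where the 8-bit exponent field is stored with a bias of 127; the type `f32_fields` and the function `f32_unpack` are names invented for the example.

```c
#include <stdint.h>
#include <string.h>

/* The three stored fields of a 32-bit float (assumed IEEE 754 binary32). */
typedef struct {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 8 bits, stored with a bias of 127 */
    uint32_t fraction;  /* 23 stored bits; the leading 1 is implicit */
} f32_fields;

/* Split a float into its fields. memcpy is used to read the bit
 * pattern without violating aliasing rules. */
f32_fields f32_unpack(float f)
{
    uint32_t bits;
    f32_fields out;

    memcpy(&bits, &f, sizeof bits);
    out.sign = (unsigned)(bits >> 31);
    out.exponent = (unsigned)((bits >> 23) & 0xFFu);
    out.fraction = bits & 0x7FFFFFu;
    return out;
}
```

Unpacking 1.0f gives sign 0, exponent 127, and fraction 0; the hidden bit supplies the leading 1, so normalized values carry 24 significant bits in total.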
