On 07/04/16 12:50, Vincent vB wrote:
> On 7-4-2016 at 10:47, Dimiter_Popoff wrote:
>> On 07.4.2016 г. 11:41, David Brown wrote:
>>> On 07/04/16 10:28, Dimiter_Popoff wrote:
>>>> On 07.4.2016 г. 11:21, David Brown wrote:
>>>>> ...
>>>>> The LatticeMico32 compiler is gcc. Use the "-ffast-math" option to
>>>>> tell it that you are happy with a bit of obvious code re-arrangement
>>>>> rather than insisting on perfect IEEE operation - this lets you
>>>>> write "* 2.0f / 32768" to clearly express your intent in the code,
>>>>> while the /compiler/ turns it into "* (2.0f / 32768.0f)".
>>>>>
>>>>> Write your code clearly and correctly, and let the tools do the work.
>>>>>
>>>>> Then all you need to do is make sure that you give the tools the best
>>>>> chance to generate fast code (such as -O2 -ffast-math, and whatever
>>>>> LM32 flags such as -mbarrel-shift-enabled and -mmultiply-enabled are
>>>>> appropriate for your particular cpu).
>>>>
>>>> Are you saying this will work without the compiler bringing in
>>>> an FP library?
>>>
>>> Almost certainly the compiler will bring in parts of its FP library.
>>
>> The question you replied to was how to do the conversion _without_
>> bringing in an FP library.
>>
>> Dimiter
>
> I wrote a test, creating the fltFromI16 using floats, as suggested by
> expert David Brown.
>
> I think I wrote my code clearly and correctly and let the /compiler/ do
> the work, with the following additional objects as result:
>
> /libgcc.a(_mul_sf.o)
> /libgcc.a(_div_sf.o)
> /libgcc.a(_si_to_sf.o)
> /libgcc.a(_thenan_sf.o)
> /libgcc.a(_muldi3.o)
> /libgcc.a(_lshrdi3.o)
> /libgcc.a(_clzsi2.o)
> /libgcc.a(_pack_sf.o)
> /libgcc.a(_unpack_sf.o)
> /libgcc.a(_mulsi3.o)
> /libgcc.a(_udivmodsi4.o)
> /libgcc.a(_clz.o)
>
> Used GCC flags that matter:
> -mbarrel-shift-enabled -mmultiply-enabled
> -msign-extend-enabled -Os -ffast-math
>
> 'The tools' also required 2736 more bytes for the same task than my
> highly flawed and inferior code.

Whether your own code is flawed or not depends on whether or not it works, and what coding standards or practices you follow. And whether it is inferior or not depends on your requirements - some would say that baking the scaling constants into a somewhat complicated manually optimised function is inferior, while others would say that requiring a couple of extra KB of library code is inferior.

I can't tell you what /you/ need for /your/ design. But I can ask you - are you so short on code space that these 2.7 KB are a significant issue? Or is it just that it "feels wrong" to "waste" this space?

If it is the former (which is not unlikely if the code is in "rom" inside a small FPGA), then that's fine. If it is the latter, then think about what is really the /best/ solution for you, the project, and the customer.
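For reference, the float-based version being argued about is presumably something like the sketch below. Vincent's actual test code is not shown in the thread, so the function body is an assumption; only the name fltFromI16 and the expression "* 2.0f / 32768" come from the posts. On a CPU without an FPU, the int-to-float conversion and the float multiply each turn into libgcc calls, which is consistent with entries such as _si_to_sf.o and _mul_sf.o in the list above.

```c
#include <stdint.h>

/* Assumed shape of the float-based conversion; the body is a guess,
 * not the code Vincent actually compiled. */
static float fltFromI16(int16_t v)
{
    /* Maps -32768..32767 onto [-2, 2). */
    return (float)v * 2.0f / 32768;
}
```

All 16-bit inputs are exactly representable in a 32-bit float, so the result is exact regardless of whether the compiler folds the constants.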
Integer/Fixedpoint to 32 bit float
Started by ●April 6, 2016
Reply by ●April 7, 2016
Reply by ●April 7, 2016
On Thu, 07 Apr 2016 09:07:00 +0200, Vincent vB wrote:
> On 7-4-2016 at 1:23, Clifford Heath wrote:
>> gcc has "int __builtin_clz (unsigned int x)", also long and long-long
>> versions. These map to whatever is most efficient for your hardware.
>>
>> It's a pity that there's no integer equivalent of ldexp; maybe called
>> ldiexp.
>>
>> To the OP: If your endian-ness and compiler bit-fields work out, you
>> can use this (works for me on x64 with gcc) for building and breaking
>> float values.
>>
>> typedef union {
>>     float f;
>>     struct {
>>         uint32_t mantissa:23;
>>         uint32_t exponent:8;
>>         uint32_t sign:1;
>>     };
>> } FloatU;

That gives you the template, but not the values to stick therein.

>> Note that building a floating point value like this is likely to be
>> slower than just saying "(float)l" - with any decent compiler. But it
>> will help you understand what's going on.
>>
>> Clifford Heath.
>
> Well, it's not really a microcontroller. It is a LatticeMico 32 on a
> Xilinx FPGA. I think it's a big-endian processor, so this may work out.
> I've tried doing the (float)l, but then the LM32 compiler really
> attempts to convert the integer into a float. Horrible tricks like this
> would work (except for the compiler screaming 'murder and fire' as we
> say in Dutch):
>
>     uint32_t l = ...;
>     float f;
>     f = *(float*)&l;

If it's on a Xilinx, and if the processor doesn't have a clz instruction, and if it's really not fast enough to do it "by hand", how about making that functionality out in FPGA-land?

You could either do it as an extension to the instruction set (if Lattice makes it easy), or as a peripheral that coughs up an answer somewhere in the memory map when the operand is written somewhere in the memory map.

This seems to be a fairly easy thing to do in an FPGA; I'd be tempted to make an entire int-to-float converter "out there" if it really needed to be that fast.

--
www.wescottdesign.com
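As a rough illustration of what "the values to stick therein" could look like, here is a minimal sketch that fills the FloatU template quoted above using __builtin_clz. The function name float_from_uint, the member name parts, and the byte-order switch are assumptions added for this example. Bit-field layout is implementation-defined, so this only works where the compiler lays the fields out as assumed here (gcc does on common targets), and on a CPU without a clz instruction gcc may still call libgcc's __clzsi2.

```c
#include <assert.h>
#include <stdint.h>

/* Template from the post, with the field order switched for big-endian
 * targets. __BYTE_ORDER__ is a GCC/Clang predefined macro. */
typedef union {
    float f;
    struct {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
        uint32_t sign:1;
        uint32_t exponent:8;
        uint32_t mantissa:23;
#else
        uint32_t mantissa:23;
        uint32_t exponent:8;
        uint32_t sign:1;
#endif
    } parts;
} FloatU;

/* Hypothetical helper: build the float equal to x, for 0 <= x < 2^24
 * (values that fit the 24-bit significand, so no rounding is needed). */
static float float_from_uint(uint32_t x)
{
    FloatU u;
    u.f = 0.0f;                         /* all bits zero */
    if (x == 0)
        return u.f;
    assert(x < (UINT32_C(1) << 24));
    int msb = 31 - __builtin_clz(x);    /* position of the leading 1: 0..23 */
    u.parts.sign = 0;
    u.parts.exponent = (uint32_t)(msb + 127);             /* biased exponent */
    u.parts.mantissa = (x << (23 - msb)) & 0x7FFFFFu;     /* drop implicit 1 */
    return u.f;
}
```

A signed input would take the same path after recording the sign and negating; the shift-and-mask variant later in the thread avoids the bit-field layout question entirely.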
Reply by ●April 7, 2016
On 07.4.2016 г. 13:50, Vincent vB wrote:
> On 7-4-2016 at 10:47, Dimiter_Popoff wrote:
>> On 07.4.2016 г. 11:41, David Brown wrote:
>>> On 07/04/16 10:28, Dimiter_Popoff wrote:
>>>> On 07.4.2016 г. 11:21, David Brown wrote:
>>>>> ...
>>>>> The LatticeMico32 compiler is gcc. Use the "-ffast-math" option to
>>>>> tell it that you are happy with a bit of obvious code re-arrangement
>>>>> rather than insisting on perfect IEEE operation - this lets you
>>>>> write "* 2.0f / 32768" to clearly express your intent in the code,
>>>>> while the /compiler/ turns it into "* (2.0f / 32768.0f)".
>>>>>
>>>>> Write your code clearly and correctly, and let the tools do the work.
>>>>>
>>>>> Then all you need to do is make sure that you give the tools the best
>>>>> chance to generate fast code (such as -O2 -ffast-math, and whatever
>>>>> LM32 flags such as -mbarrel-shift-enabled and -mmultiply-enabled are
>>>>> appropriate for your particular cpu).
>>>>
>>>> Are you saying this will work without the compiler bringing in
>>>> an FP library?
>>>
>>> Almost certainly the compiler will bring in parts of its FP library.
>>
>> The question you replied to was how to do the conversion _without_
>> bringing in an FP library.
>>
>> Dimiter
>
> I wrote a test, creating the fltFromI16 using floats, as suggested by
> expert David Brown.
>
> I think I wrote my code clearly and correctly and let the /compiler/ do
> the work, with the following additional objects as result:
>
> /libgcc.a(_mul_sf.o)
> /libgcc.a(_div_sf.o)
> /libgcc.a(_si_to_sf.o)
> /libgcc.a(_thenan_sf.o)
> /libgcc.a(_muldi3.o)
> /libgcc.a(_lshrdi3.o)
> /libgcc.a(_clzsi2.o)
> /libgcc.a(_pack_sf.o)
> /libgcc.a(_unpack_sf.o)
> /libgcc.a(_mulsi3.o)
> /libgcc.a(_udivmodsi4.o)
> /libgcc.a(_clz.o)
>
> Used GCC flags that matter:
> -mbarrel-shift-enabled -mmultiply-enabled
> -msign-extend-enabled -Os -ffast-math
>
> 'The tools' also required 2736 more bytes for the same task than my
> highly flawed and inferior code.
>
> Vincent

I spent a few minutes doing it in my vpa (for Power Architecture); it takes 60 bytes and destroys 2 registers. It might be of some help, here it is:

http://tgi-sci.com/vpaex/i16tofp32.sa <-- source,
http://tgi-sci.com/vpaex/i16tofp32.txt <-- vpa list with native code.

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/
Reply by ●April 7, 2016
On Wed, 06 Apr 2016 15:17:05 +0200, Vincent vB wrote:
> Unfortunately the LSM303D produces 16 bit signed integers. So, the
> embedded system needs to convert these values to floats. The scaling
> itself is quite simple: Values -32768..32767 need to be scaled to [-2,2).

If you have 256 kiB to spare, you could use a lookup table.

The following should work on any 32-bit architecture. count_bits() can be optimised, possibly to a single instruction.

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    /* Number of bits needed to hold x: index of the highest set bit, plus 1. */
    static int count_bits(unsigned int x)
    {
        for (int i = 0; i < 32; i++)
            if (x >> i == 0)
                return i;
        abort();    /* only reached if bit 31 is set */
    }

    static void int_to_float(int x, float *result)
    {
        uint32_t bits = 0;                      /* x == 0 maps to +0.0f */
        if (x != 0) {
            int negative = x < 0;
            if (negative)
                x = -x;
            int exponent = count_bits(x) - 1;   // 0 .. 15
            bits = (uint32_t)x << (23 - exponent);
            bits &= ~(1U << 23);                // drop the implicit leading 1
            bits |= (exponent + 127U) << 23;    // biased exponent
            if (negative)
                bits |= (1U << 31);             // sign bit
        }
        /* memcpy instead of *(uint32_t*)result avoids a strict-aliasing violation */
        memcpy(result, &bits, sizeof bits);
    }
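The routine above converts the integer value as-is; it does not yet apply the scaling to [-2,2) that the original question asks for. Since 2/32768 is 2^-14, that scaling can probably be folded into the exponent field at no extra cost. The following is a minimal sketch of that idea, assuming a 32-bit IEEE-754 float stored with the same byte order as uint32_t; the names highest_bit and flt_from_i16_scaled are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helper: index of the highest set bit of a non-zero value.
 * Could be replaced by a clz instruction or __builtin_clz where available. */
static int highest_bit(uint32_t x)
{
    int n = 0;
    while (x >>= 1)
        n++;
    return n;
}

/* Convert a signed 16-bit sample to float, scaled by 2^-14 so that
 * -32768..32767 maps onto [-2, 2). Uses integer operations only. */
static float flt_from_i16_scaled(int16_t v)
{
    uint32_t bits = 0;                          /* v == 0 gives +0.0f */
    if (v != 0) {
        /* Magnitude in 1..32768; widen first so -(-32768) cannot overflow. */
        uint32_t mag = (v < 0) ? (uint32_t)(-(int32_t)v) : (uint32_t)v;
        int e = highest_bit(mag);               /* 0..15 */
        bits = mag << (23 - e);                 /* leading 1 lands on bit 23 */
        bits &= ~(UINT32_C(1) << 23);           /* drop the implicit leading 1 */
        bits |= (uint32_t)(e - 14 + 127) << 23; /* bias 127, scale 2^-14 */
        if (v < 0)
            bits |= UINT32_C(1) << 31;          /* sign bit */
    }
    float f;
    memcpy(&f, &bits, sizeof f);                /* bit copy, no FP conversion */
    return f;
}
```

Because every 16-bit magnitude fits in the 24-bit significand, no rounding step is needed, and the biased exponent stays in the range 113..128, so denormals and overflow do not arise.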







