EmbeddedRelated.com
Forums

Damned floating point

Started by Tim Wescott February 26, 2016
On 26.02.2016 at 19:58, Tim Wescott wrote:
> I assume the answer is "yes", but is it normal behavior in an x86
> processor to get slightly different answers when you add up the same
> short vector of doubles at different times?
>
> (Very slightly different -- basically so that sum1 == sum2 isn't true,
> but not noticeable in any other way.)
Sounds like you're using gcc. Read up on "-ffloat-store". That option is required to make gcc a fully-conforming C compiler.

With optimisation enabled, gcc will try to keep variables in floating-point registers. The problem is that the x87 floating-point registers always have 80 bits of precision, whereas a regular 'double' variable has just 64. So if you end up comparing a variable that remained in a register all the time with a variable that was spilled to memory, you'll get a mismatch. Strictly speaking, this is a violation of the C language standard, but gcc probably invokes customary rights here.

(Context switches are not an issue. These save all 80 bits if your operating system is remotely sensible. x86 is not THAT bad.)

Stefan
On Sat, 27 Feb 2016 04:44:27 -0500, George Neuner
<gneuner2@comcast.net> wrote:

>On Sat, 27 Feb 2016 02:12:52 -0600, Robert Wessel
><robertwessel2@yahoo.com> wrote:
>
>>OK, I'm baffled. At least three people have mentioned failure to
>>store guard bits as a possible source for this error. Unless there's
>>a *very* unconventional usage equating the extra bits of precision in
>>the 80-bit format used by default in the x87 to "guard bits", it's
>>just not so.
>
>When the compiler needs to spill what it sees to be a "double" value,
>it will save only 64 bits even when the x87 is in (default) extended
>mode. Unless you deliberately set the x87 into double mode, you don't
>just lose a few guard bits, you lose 11 bits of actual value when the
>64-bit extended mantissa is truncated to 53 bits for storage.
>
>Not all x86 compilers recognize "long double" as an 80-bit value, some
>treat it as a synonym for "double". And AFAIK, every x86-64 compiler
>treats "long double" = "double" ... they don't use the x87 at all but
>rather use XMM instead.
Both true, but not related to either guard bits or context switches. The problem is that the irregular existence of the extra precision causes numerical instability. The lack of 80-bit long double support is mostly limited to compilers targeting Windows. Most *nix x86 implementations do support the longer format.
>>Guard bits are used within a single FP operation to prevent loss of
>>precision when the two operands need to be scaled. These are never
>>visible in the ISA, and never persist between operations (except in an
>>indirect way in that an operation may have a more precise result than
>>it would otherwise).
>
>The extra ~3 bits of guard precision are available as long as the
>value remains in an FPU register.
No, they're not - they exist only during the course of an operation (e.g. *during* an FADD). The three extra bits are the key to implementing the IEEE accuracy rule (roughly: compute as if to infinite precision, then round) with a sane amount of hardware - basically those three extra bits. Once the result is back in the register, the three extra bits are gone.
>>Context switches are also non-issues. You can save the state of the
>>entire x87 FPU, and restore it later. That stores the contents of all
>>80 bits of each FP register. Now it's theoretically possible that an
>>OS might manually step through the individual registers, and store
>>only (say) 64-bit values for each, but I've never heard of such a
>>thing - not only would it be a lot more complex, it would likely be
>>quite a bit slower too, especially when the context would need to be
>>restored.
>
>Both Windows and Linux save XMM registers lazily - but they can do
>this because there are [many] hidden rename registers. As long as the
>CPU doesn't run out of rename registers, it can preserve XMM values
>across a context switch without spilling them to memory.
The presence of rename registers does not change the timing of the needed saves. If the OS elects to do lazy saves of the FP registers (and that applies to x87, MMX, SSE and AVX state equally), it uses the task-switched (TS) bit in CR0. If that bit is set, any attempt to access any non-integer registers will generate a #NM exception, at which point the OS can save the old state and restore the new one. Alternatively, the OS could save/restore the FP registers immediately upon the task switch.

The advantage of doing it lazily is that you might be able to avoid the work entirely (for example, an interrupt handler might run and return without ever accessing any of the FP registers), you might be able to finish the task switch faster, and long FP operations can complete while other code executes. On the flip side, when lazy switches do happen, they require handling an interrupt, which is never particularly cheap.

There's no way to allow "old" state to remain in any architecturally visible way. Something like this sort-of happens with SMT (Hyperthreading), but in that case there's an architectural state for each thread, even though they share all the (rename) registers. A CPU might implement more contexts than it can actually execute simultaneously, and then effectively store suspended contexts in the rename registers, but that would just appear to the OS as more threads, each with a full architectural state. No x86, AFAIK, has implemented such a scheme.
>This goes for ALU values also. On x86-64, at least theoretically, it
>is possible to context switch without spilling any values at all. In
>practice it is CPU model and code dependent [and likely very hard to
>achieve in any case].
Again, no. On a context switch the registers have to be saved and reloaded with the new context. In the case of the FP registers, that can be delayed until they're actually accessed in the new context (which may allow you to avoid saving them at all if the new context never uses them), but once the new context needs them, they need to be swapped. If the context switches are purely cooperative, the OS can decide not to save most of the registers (at the least, on x86 you're going to have to save the xPC and xSP), and leave any other saving and restoring up to the caller. But interrupt-driven context switches don't have that luxury, since the application cannot control when they happen.
>I don't know whether x87 state is similarly protected. I don't know
>whether the early Pentiums ever did FPU renaming, and since the
>Pentium 4 (SSE2) x87 use has been highly discouraged, so I tend to
>doubt that there has been much (or any) enhancement since then.
On 26.02.2016 20:42, Tim Wescott wrote:
>>> I assume the answer is "yes", but is it normal behavior in an x86
>>> processor to get slightly different answers when you add up the same
>>> short vector of doubles at different times?
>>>
>>> (Very slightly different -- basically so that sum1 == sum2 isn't true,
>>> but not noticeable in any other way.)
>
> In this case, though, I'm computing score = a + b + c. Strictly in that
> order. Twice, with other floating point calculations in between. And
> getting different answers.
Maybe score1 is computed as (a+b)+c and score2 as a+(b+c) (or even (a+c)+b). I don't know if the compiler is free to rearrange the terms.

So you are doing other floating point calculations in between? Do they involve a, b, c as terms or results? How much do score1 and score2 differ? How much do a, b, c differ?

Regards
Martin
"Tim Wescott" <seemywebsite@myfooter.really> wrote in message 
news:UJednYstNY8hN03LnZ2dnUU7-YudnZ2d@giganews.com...
> On Fri, 26 Feb 2016 13:24:02 -0600, Robert Wessel wrote:
>
>> On Fri, 26 Feb 2016 12:58:07 -0600, Tim Wescott
>> <seemywebsite@myfooter.really> wrote:
>>
>>>I assume the answer is "yes", but is it normal behavior in an x86
>>>processor to get slightly different answers when you add up the same
>>>short vector of doubles at different times?
>>>
>>>(Very slightly different -- basically so that sum1 == sum2 isn't true,
>>>but not noticeable in any other way.)
>>>
>>>I assume this has something to do with how the floating-point processor
>>>state is treated when a calculation is done. Does anyone know the
>>>details?
>>
>> Floating point addition is not associative. Never has been, never will
>> be. IOW, (a+b)+c is *not* the same as a+(b+c).
>>
>> In short it's a problem with the rounding of intermediate results.
>> Consider a three digit decimal FP system, and you add 100, -100 and .01.
>> If you compute (100 + -100) + .01, you'll get .01. If you compute
>> (100 + .01) + -100, you get 0, since the (100 + .01) rounds back to 100.
>> It's actually fairly easy to get into a situation (like the one I just
>> illustrated) where the result is *entirely* noise. People doing long
>> sums (matrix multiplication, for example) often take considerable pains
>> to minimize the accumulated error, including carefully controlling the
>> order in which things are added.
>>
>> In general, there are few cases where two FP numbers can be meaningfully
>> compared for equality. Usually the correct approach is to do something
>> like "fabs(a-b) < epsilon", although picking epsilon can be a bit
>> tricky.
>>
>> "What Every Computer Scientist Should Know About Floating-Point
>> Arithmetic":
>>
>> https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
>
> In this case, though, I'm computing score = a + b + c. Strictly in that
> order. Twice, with other floating point calculations in between. And
> getting different answers.
>
> --
>
> Tim Wescott
> Wescott Design Services
> http://www.wescottdesign.com
Taking a punt here, but things I would wonder about:

Has something changed the operating set-up of the FP unit?
Could an interrupt be coming in, saving and restoring the FP result?
On 2016-02-27, Robert Wessel <robertwessel2@yahoo.com> wrote:
> On Fri, 26 Feb 2016 23:38:17 +0000 (UTC), Grant Edwards
> <invalid@invalid.invalid> wrote:
>
>>To address the context-switch problem you might have to disable
>>interrupts or lock the scheduler -- unless you know for a fact that
>>your OS preserves guard bits during FPU context switches.
>
> OK, I'm baffled. At least three people have mentioned failure to
> store guard bits as a possible source for this error. Unless there's
> a *very* unconventional usage equating the extra bits of precision in
> the 80-bit format used by default in the x87 to "guard bits", it's
> just not so.
You're right -- when I wrote "guard bits" I meant the extra bits that get discarded when an 80-bit FPU register gets stored in a 64-bit register or memory location. I should have referred to them as extra precision bits or something like that.

One would hope that saving and restoring the FPU state would preserve all 80 bits of each value, but I've seen compiler and kernel writers do some pretty stupid things over the years...

--
Grant
On 2/27/2016 10:24 AM, Bill Davy wrote:
> "Tim Wescott" <seemywebsite@myfooter.really> wrote in message
> news:UJednYstNY8hN03LnZ2dnUU7-YudnZ2d@giganews.com...
>
> <clip>
>
>> In this case, though, I'm computing score = a + b + c. Strictly in that
>> order. Twice, with other floating point calculations in between. And
>> getting different answers.
>
> Taking a punt here but things I would wonder about:
>
> Has something changed the operating set-up of FP unit?
> Could an interrupt be coming in, saving and restoring FP result?
I'm not certain an interrupt would cause this problem. 80-bit floating point is a valid size for calculations. The interrupt code won't know how the floating point registers are being used, so it will have to save and restore the full state of the FP unit rather than truncate the values before saving.

--
Rick
On Sat, 27 Feb 2016 23:31:38 -0500, rickman <gnuarm@gmail.com> wrote:

>On 2/27/2016 10:24 AM, Bill Davy wrote:
>> "Tim Wescott" <seemywebsite@myfooter.really> wrote in message
>> news:UJednYstNY8hN03LnZ2dnUU7-YudnZ2d@giganews.com...
<clip>
>> Taking a punt here but things I would wonder about:
>>
>> Has something changed the operating set-up of FP unit?
>> Could an interrupt be coming in, saving and restoring FP result?
>
>I'm not certain an interrupt would cause this problem. 80 bit floating
>point is a valid size for calculations. The interrupt code won't know
>how the floating point registers are being used so it will have to save
>and restore the full state of the FP unit rather than truncate the
>values before saving.
It is very hard for me to understand why someone would need to use floating point instructions in an ISR; if the ISR doesn't touch the FPU, there is no need to save and restore the FP stack.

Usually there is something seriously wrong with the program architecture if you _have_to_ use FP in an ISR.
<upsidedown@downunder.com> wrote in message 
news:afb5dbp4n29fep9p7utnlagi9rpunudmld@4ax.com...
> On Sat, 27 Feb 2016 23:31:38 -0500, rickman <gnuarm@gmail.com> wrote:
>
> <clip>
>
>>I'm not certain an interrupt would cause this problem. 80 bit floating
>>point is a valid size for calculations. The interrupt code won't know
>>how the floating point registers are being used so it will have to save
>>and restore the full state of the FP unit rather than truncate the
>>values before saving.
>
> It is very hard for me to understand why someone would need to use
> floating point instructions in the ISR, thus no need to save and
> restore the FP stack.
>
> Usually there is something seriously wrong with the program
> architecture if you _have_to_ use FP in ISR.
I agree it is neither normal nor desirable to use FP in an ISR, but (a) someone else may have written the ISR, and (b) s**t happens, and (c) when you have discounted everything else ...
On 2/28/2016 3:25 AM, upsidedown@downunder.com wrote:
> On Sat, 27 Feb 2016 23:31:38 -0500, rickman <gnuarm@gmail.com> wrote:
>
> <clip>
>
> It is very hard for me to understand why someone would need to use
> floating point instructions in the ISR, thus no need to save and
> restore the FP stack.
>
> Usually there is something seriously wrong with the program
> architecture if you _have_to_ use FP in ISR.
I'm not going to judge code I haven't seen or requirements I don't know about. Code is code, and floating point is just another form of data.

--
Rick
upsidedown@downunder.com writes:
> It is very hard for me to understand why someone would need to use
> floating point instructions in the ISR, thus no need to save and
> restore the FP stack.
I have to wonder what x86 model this is running on, and what compiler, since x86 floating point computation is typically done these days with XMM (SSE2) instructions rather than the stack-based x87 instruction set; I don't know what that does to the 80-bit intermediate results.

I guess Tim has deadlines to meet and is happy to have put the issue behind him with a workaround, but it would be interesting to see the assembly code that came out of the compiler from his code.