On Wed, 18 Feb 2015 18:12:20 +0100, Hans-Bernhard Bröker <HBBroeker@t-online.de> wrote:

>On 18.02.2015 at 17:52, glen herrmannsfeldt wrote:
>
>> Otherwise, an older favorite was the Cray machine with non-commutative
>> multiply. A*B-B*A might not be zero.
>
>If memory serves Seymour Cray also gained some notoriety for building
>machines where A*1 might not be equal to A.

While I'm not sure A*1 was ever not equal to A, there were certainly cases where A*B was not equal to B*A.
two fpu compared
Started by ●February 12, 2015
Reply by ●February 18, 2015
Reply by ●February 18, 2015
On 18.2.2015 19:18, Dimiter_Popoff wrote:

> On 18.2.2015 18:52, glen herrmannsfeldt wrote:
> ....
> ....
>>
>> Given a 53 bit quotient of two integers, can you find the correct 32
>> bit integer quotient?
>
> Hmmmm, you make me scratch my head. I think yes, using the correct
> rounding modes etc., but I would not claim anything without
> thinking about it in "doing work" mode, which I can't at the
> moment (head busy doing other things).

LOL, my head must indeed have been busy with other nonsense. Of course you can. On the power FPU - if it is a 32 bit power - you can get the correct integer quotient if it fits into 32 bits (signed, so 31 bits really). All one has to do is the divide, then a convert to integer rounding towards zero (there is such an opcode); for the remainder, a multiply and a subtract will do it.

On 64 bit power I think the limit is 64 (63) bits, but I am not sure; I have yet to lay my hands on a 64 bit beast. My head would have to work a bit more on the limits/data loss regarding the remainder, though - some other time :D.

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/
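A minimal C sketch of the scheme described above - divide in double, truncate towards zero, then recover the remainder with a multiply and a subtract. The function name is hypothetical, and C's truncating cast stands in for the PowerPC fctiwz opcode. The result is exact because for 32-bit operands the true quotient times the divisor stays below 2^53, so the correctly rounded double quotient truncates to the exact integer quotient.

    #include <stdint.h>

    /* Hypothetical sketch: 32-bit signed division done on the FPU.
       Preconditions: b != 0, and not (a == INT32_MIN && b == -1).
       The (int32_t) cast truncates towards zero, as fctiwz does. */
    static void fpu_divmod32(int32_t a, int32_t b, int32_t *q, int32_t *r)
    {
        double fq = (double)a / (double)b;  /* correctly rounded to 53 bits */
        *q = (int32_t)fq;                   /* truncate towards zero */
        *r = a - *q * b;                    /* multiply and subtract */
    }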
Reply by ●February 19, 2015
On 19.2.15 02:02, Robert Wessel wrote:

> On Wed, 18 Feb 2015 18:12:20 +0100, Hans-Bernhard Bröker
> <HBBroeker@t-online.de> wrote:
>
>> On 18.02.2015 at 17:52, glen herrmannsfeldt wrote:
>>
>>> Otherwise, an older favorite was the Cray machine with non-commutative
>>> multiply. A*B-B*A might not be zero.
>>
>> If memory serves Seymour Cray also gained some notoriety for building
>> machines where A*1 might not be equal to A.
>
> While I'm not sure A*1 was ever not equal to A, there were certainly
> cases where A*B was not equal to B*A.

The classical example is 10.0 * 0.1. It is due to 0.1 having an infinitely long binary fraction (001100110011....).

--
-TV
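One quick way to see the representation problem (my own illustration, not from the thread; the commented output is what an IEEE-754 double produces, other formats will differ slightly):

    #include <stdio.h>

    int main(void)
    {
        /* 0.1 has no exact binary representation; the nearest
           IEEE-754 double is slightly larger than 0.1. */
        printf("%.20g\n", 0.1);         /* 0.10000000000000000555... */
        printf("%.20g\n", 10.0 * 0.1);  /* happens to round back to exactly 1 */
        return 0;
    }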
Reply by ●February 19, 2015
On 20.02.2015 01:04, Robert Wessel wrote:

> On Thu, 19 Feb 2015 09:42:50 +0200, Tauno Voipio
> <tauno.voipio@notused.fi.invalid> wrote:
>
>> On 19.2.15 02:02, Robert Wessel wrote:
>>> On Wed, 18 Feb 2015 18:12:20 +0100, Hans-Bernhard Bröker
>>> <HBBroeker@t-online.de> wrote:
>>>
>>>> On 18.02.2015 at 17:52, glen herrmannsfeldt wrote:
>>>>
>>>>> Otherwise, an older favorite was the Cray machine with non-commutative
>>>>> multiply. A*B-B*A might not be zero.
>>>>
>>>> If memory serves Seymour Cray also gained some notoriety for building
>>>> machines where A*1 might not be equal to A.
>>>
>>> While I'm not sure A*1 was ever not equal to A, there were certainly
>>> cases where A*B was not equal to B*A.
>>
>> The classical example is 10.0 * 0.1. It is due to 0.1 having an
>> infinitely long binary fraction (001100110011....).
>
> That's a general issue with FP. And on most machines you're going to
> get the same (slightly off) result whether you do 10*.1 or .1*10.
>
> We're talking about some early Cray machines, which had a tendency to
> curdle a few low bits of various FP operations, so that while 10*.1
> and .1*10 would still not produce 1.0 exactly, they'd produce
> slightly *different* values.
>
> Numerical analysis on the early Crays was a contact sport.

I remember that one version had PI wrong in their Fortran real*16 routines. A colleague found this when his calculation differed between a Cray and an IBM 370.

--
Reinhardt
Reply by ●February 19, 2015
Dimiter_Popoff <dp@tgi-sci.com> wrote:

(snip, I wrote)

>> I haven't thought of the fine details recently, but I believe that
>> if you have round to nearest floating point, it is not so easy to
>> get the appropriate truncated integer quotient.

> Absolutely correct of course. Setting the FPU rounding mode "to zero"
> would solve this (if available, otherwise it would take some work).
> I got bitten not so long ago by a similar, simpler error I had made;
> instead of using "convert FP to integer and round to zero" (there is
> such a power architecture opcode) I had used just "move FP to integer".
> The latter rounds to nearest and I had to locate and fix it.... :-).

(and I also wrote)

>> Given a 53 bit quotient of two integers, can you find the correct 32
>> bit integer quotient?

> Hmmmm, you make me scratch my head. I think yes, using the correct
> rounding modes etc., but I would not claim anything without
> thinking about it in "doing work" mode, which I can't at the
> moment (head busy doing other things).

The assumption of round to nearest from above was supposed to apply to this case.

Traditionally, you got what the hardware gave you.

IBM S/360 and successor hexadecimal floating point truncates on division (except on the 360/91, where it rounds).

Many other processors from before IEEE 754 only round.

For a discussion of double rounding, see:

https://en.wikipedia.org/wiki/Rounding#Double_rounding

-- glen
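For a concrete taste of the double-rounding hazard that page describes (a worked example of my own, in decimal for readability): rounding 2.47 to one decimal gives 2.5, and rounding 2.5 to an integer with ties going up gives 3, yet rounding 2.47 to an integer directly gives 2. The same effect shows up in binary whenever a result is first rounded to a wider intermediate precision and then rounded again to the final format.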
Reply by ●February 19, 2015
On Thu, 19 Feb 2015 09:42:50 +0200, Tauno Voipio <tauno.voipio@notused.fi.invalid> wrote:

>On 19.2.15 02:02, Robert Wessel wrote:
>> On Wed, 18 Feb 2015 18:12:20 +0100, Hans-Bernhard Bröker
>> <HBBroeker@t-online.de> wrote:
>>
>>> On 18.02.2015 at 17:52, glen herrmannsfeldt wrote:
>>>
>>>> Otherwise, an older favorite was the Cray machine with non-commutative
>>>> multiply. A*B-B*A might not be zero.
>>>
>>> If memory serves Seymour Cray also gained some notoriety for building
>>> machines where A*1 might not be equal to A.
>>
>> While I'm not sure A*1 was ever not equal to A, there were certainly
>> cases where A*B was not equal to B*A.
>
>The classical example is 10.0 * 0.1. It is due to 0.1 having an
>infinitely long binary fraction (001100110011....).

That's a general issue with FP. And on most machines you're going to get the same (slightly off) result whether you do 10*.1 or .1*10.

We're talking about some early Cray machines, which had a tendency to curdle a few low bits of various FP operations, so that while 10*.1 and .1*10 would still not produce 1.0 exactly, they'd produce slightly *different* values.

Numerical analysis on the early Crays was a contact sport.
Reply by ●February 21, 2015
On 19.2.2015 18:29, glen herrmannsfeldt wrote:

> Dimiter_Popoff <dp@tgi-sci.com> wrote:
>
> .....
>>> Given a 53 bit quotient of two integers, can you find the correct 32
>>> bit integer quotient?
>
>> Hmmmm, you make me scratch my head. I think yes, using the correct
>> rounding modes etc., but I would not claim anything without
>> thinking about it in "doing work" mode, which I can't at the
>> moment (head busy doing other things).
>
> The assumption of round to nearest from above was supposed to apply
> to this case.
>
> Traditionally, you got what the hardware gave you.
>
> IBM S/360 and successor hexadecimal floating point truncates
> on division (except on the 360/91, where it rounds).
>
> Many other processors from before IEEE 754 only round.
>
> For a discussion of double rounding, see:
>
> https://en.wikipedia.org/wiki/Rounding#Double_rounding

Hmmmm, while I did not know that they call it "double rounding", I obviously knew what it is, as well as its obvious consequences.

The power FPU does have a "round towards zero" mode, though, meaning the absolute value of the infinitely precise result is always rounded down; IOW, the result is not rounded up after the operation (at least this is how I understand it, and it seems clear enough). So doing a divide such that the integer part of the quotient fits in the 53 bits - or let's say in 31 bits for simplicity - will always be precise (unrounded).

Once one turns "round to nearest" on (which is how I keep it most if not all of the time), then obviously, prior to recording the result in the FP register, the hardware will add a 1 to the bit just below the LSB, and all of the problems you point to will apply.

I am not that frightened by that sort of thing because I write in VPA, with the register model in mind, and if I begin to scratch my head about how practical it is to use FP for something, I will just not use it, or I will take into account the applicable rounding and other side effects.

On the example I gave earlier (the netmca-3) I used the FPU as a DSP, doing plenty of MAC and other messy calculations, where the LSB of the result (used up to 14 bits, though more are available) is thinner than the noise of the incoming signal. I don't remember even having to care much about rounding. I have had my moments with it, obviously, but I don't remember when or doing what .... :-)

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/
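The distinction between the two conversion flavours can be sketched in portable C for anyone without a power machine at hand (my own sketch, assuming a C99 library with rounding-mode support): a plain cast always truncates towards zero, much like fctiwz, while lrint() honours the current rounding mode, much like fctiw.

    #include <fenv.h>
    #include <math.h>
    #include <stdio.h>

    /* Strictly conforming code needs this pragma; some compilers
       ignore it. Link with -lm. */
    #pragma STDC FENV_ACCESS ON

    int main(void)
    {
        double x = 2.75;

        printf("(long)x        = %ld\n", (long)x);   /* 2: cast truncates */

        fesetround(FE_TONEAREST);
        printf("lrint, nearest = %ld\n", lrint(x));  /* 3 */

        fesetround(FE_TOWARDZERO);
        printf("lrint, to zero = %ld\n", lrint(x));  /* 2 */

        return 0;
    }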
Reply by ●February 22, 2015
On Fri, 13 Feb 2015 16:23:15 +0100, David Brown wrote:

> What this all boils down to is that the client wants to be sure that the
> code is "correct" - but has an unreasonable and restrictive definition of
> "correct".

Or maybe they just want to be sure that they have a *precise* specification in case they ever need to source perfectly-compatible parts, or perform simulations or analysis whose results aren't rendered meaningless by "minor" differences between implementations.

There are situations where you need different implementations to be able to perform identical calculations on identical inputs and obtain *identical* results, down to the least significant bit. In such cases, whether the results are "correct" is often less relevant than whether they're consistent.

Key things to avoid are unsafe optimisations (-ffast-math etc., i.e. anything which allows the compiler to pretend that floats observe the same rules as reals), extended intermediate precision (use -ffloat-store and/or -fexcess-precision=standard), fused multiply-add (use -ffp-contract=off and/or -mno-fused-madd), and transcendental functions (these invariably have to be implemented in software or avoided altogether).
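Collected into one place, the GCC flags named above might look like this in a build (an illustrative fragment, not from the thread; exact spellings vary by compiler version and target - -mno-fused-madd in particular is target-specific and was superseded by -ffp-contract on newer ports):

    # Reproducible-FP build flags, per the post above; verify
    # against your compiler's documentation before relying on them.
    CFLAGS = -O2 \
             -ffloat-store \
             -fexcess-precision=standard \
             -ffp-contract=off
    # Deliberately absent: -ffast-math and friends, which let the
    # compiler treat floats as if they were mathematical reals.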