Re: Floating point performance issue

On 2011-12-20 22:52:12 +0100, David Brown wrote:
> I understand what you are saying here - and I agree that it's very important
> that any such choice is made clear and explicit.  But that's why a program's
> makefile is part of its source code - compiler flags are an essential part
> of the source.

Very often there isn't a static makefile, for instance for all the
programs built via the autotools: the compiler options are provided
by whoever builds the program. Note that there have already been
complaints because some program gave incorrect results due to the
-ffast-math option provided by a third party (who didn't know that
it shouldn't have been used for this program).

BTW, I've found in my archives the following example I had posted:

#include <stdlib.h>

float x = 30.0;

int main(void)
{
  if (90.0 / x != 3.0)  /* exact under IEEE 754: 90/30 = 3 */
    abort();
  return 0;
}

This fails (the abort is triggered) with -ffast-math on x86.

Note that, more generally, depending on the application, one may be
able to prove that the division is exact (e.g. when computing some
determinant on simple data), so that the != comparison is perfectly
safe in such a case.
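
For instance, here is a small sketch of such a case (illustrative
code, not from my original example; it assumes IEEE 754 binary64,
round-to-nearest, and small integer-valued inputs; det2 is just a
helper for this sketch):

#include <stdio.h>

/* With doubles holding small integers, each product and the
   difference fit in 53 bits, so the 2x2 determinant is exact. */
static double det2(double a, double b, double c, double d)
{
  return a * d - b * c;
}

int main(void)
{
  double d1 = det2(6.0, 2.0, 3.0, 4.0);  /* 6*4 - 2*3 = 18, exact */
  double d2 = det2(3.0, 1.0, 3.0, 3.0);  /* 3*3 - 1*3 = 6, exact */

  /* 18/6 = 3 is representable, and IEEE 754 division is correctly
     rounded, so the quotient is exact and the test is well-defined
     (unless the expression is rewritten by -ffast-math). */
  if (d1 / d2 != 3.0)
    printf("unexpected: the exact quotient was perturbed\n");
  else
    printf("exact quotient, as provable from the data\n");
  return 0;
}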

[...]
> The programs would be equally portable if the last few bits of calculations
> varied, or the rounding was different on different machines.

No, even if these last few bits are meaningless, full reproducibility
may be important in some cases, e.g. for debugging, or for checking
the results on a different machine.

> And people do not expect bit-perfect repeatability of floating
> point, especially not across targets.

Even across targets: many people complained that they got different
results on x86 (where extended precision is used) and on other
platforms; this happened, for instance, in CERN's LHC@home project.

> In my opinion, code that relies on exceptions to spot errors in calculations
> is normally bad code.  You don't do a division and handle divide-by-zero
> errors - you write code so that you never try to divide by zero.  At best,
> such exceptions are an aid during debugging.

In many scientific codes, avoiding such exceptions would mean a loss
of performance. Under the IEEE 754 spec, the code may still be correct
even when they occur. That's why infinities were introduced in the
standard (otherwise a NaN would have been sufficient for all such
exceptions).
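
A classic illustration (sketch only, not from this thread; parallel()
is a helper for the sketch, assuming IEEE 754 double with default
exception handling): the parallel-resistance formula needs no
special-case test, because the division-by-zero exception just sets
the flag and delivers an infinity that propagates to the right answer.

#include <stdio.h>

/* 1/0 = +inf and 1/(+inf) = 0, so a short circuit (r1 == 0) is
   handled correctly without any explicit test.  Note that
   -ffinite-math-only would invalidate this reasoning. */
static double parallel(double r1, double r2)
{
  return 1.0 / (1.0 / r1 + 1.0 / r2);
}

int main(void)
{
  printf("%g\n", parallel(100.0, 100.0));  /* 50 */
  printf("%g\n", parallel(0.0, 100.0));    /* 0, thanks to the infinity */
  return 0;
}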

> Again, -fassociative-math is not a problem for code that does sensible
> calculations.

Wrong. It is a problem in various codes.

>  It's theoretical only, or for people who want to insist on
> bit-perfect repeatability of their floating point code. The example
> case given in the gcc manual is "(x + 2**52) - 2**52)". Assuming the
> implication is that x is a small number, there are almost no
> real-world circumstances when such code exists.

Many codes will fail if such math expressions are rewritten. See for
instance rint() implementations, and the codes whose goal is to
improve the accuracy of floating-point computations (these are
generally based on algorithms like TwoSum/FastTwoSum and Veltkamp's
splitting). For instance, see the IBM Accurate Mathematical Library,
which is part of glibc.
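
Here is a sketch of the FastTwoSum idea mentioned above (illustrative
code, not taken from glibc; it assumes |a| >= |b|, IEEE 754 double and
round-to-nearest):

#include <stdio.h>

/* s + t equals a + b exactly, where t is the rounding error of the
   addition.  A compiler allowed to reassociate (-fassociative-math)
   may fold t to 0 and silently destroy the algorithm. */
static void fast_two_sum(double a, double b, double *s, double *t)
{
  *s = a + b;           /* rounded sum */
  *t = b - (*s - a);    /* exact rounding error of a + b */
}

int main(void)
{
  double s, t;
  fast_two_sum(1.0, 0x1p-60, &s, &t);  /* b is far below the ulp of a */
  printf("s = %g, t = %g\n", s, t);    /* s = 1, t = 8.67362e-19 */
  return 0;
}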

> And if you really want to add or subtract numbers that differ by 52
> orders of binary magnitude, and are interested in accurate results,
> you don't use "float" or "double" types anyway.

Wrong again. I suggest that you read our "Handbook of Floating-Point
Arithmetic".

GNU MPFR could be used to do reliable FP arithmetic, but it should
not be seen as a replacement for hardware FP arithmetic when simple
algorithms based on IEEE 754 arithmetic (as described in our book)
can solve the problem.
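
To make the quoted "(x + 2**52) - 2**52" idiom concrete, here is a
sketch of the kind of code it comes from (illustrative only; it
assumes IEEE 754 binary64, round-to-nearest, and 0 <= x < 2**52,
which is roughly how some rint()-style code works):

#include <stdio.h>

/* For 0 <= x < 2**52, adding and subtracting 2**52 rounds x to the
   nearest integer (ties to even), because doubles in [2**52, 2**53)
   are spaced 1 apart.  Reassociation folds this back to x and
   breaks it. */
static double round_to_int(double x)
{
  const double two52 = 0x1p52;   /* 2^52 */
  return (x + two52) - two52;
}

int main(void)
{
  printf("%g\n", round_to_int(2.3));  /* 2 */
  printf("%g\n", round_to_int(2.5));  /* 2 (ties to even) */
  printf("%g\n", round_to_int(3.5));  /* 4 (ties to even) */
  return 0;
}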

> >>That's probably one of the most common mistakes with floating point
> >>- the belief that two floating point numbers can be equal just
> >>because mathematically they should be.
> >
> >No, the problem is more than that, it's also about the consistency.
> >Assuming that
> >
> >int foo (void)
> >{
> >   double x = 1.0/3.0;
> >   return x == 1.0/3.0;
> >}
> >
> >returns true should not be regarded as a mistake (even though it might
> >be allowed to return false for some reasons, the default should be
> >true).
> >
> 
> You should not be able to rely on code like that giving either true or false
> consistently.  You say yourself it "should return true" but "might return
> false".  The code is therefore useless, and it would be a good idea for the
> compiler to warn about it.

It is not useless. The developer should have a way to control this
(and the ISO C99 standard provides some ways of checking it, though
they are rather limited).
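
Assuming the C99 facility alluded to includes the FLT_EVAL_METHOD
macro from <float.h>, here is a minimal sketch of such a check; it
only reports the translation-time evaluation method, which is part
of why it is rather limited:

#include <float.h>
#include <stdio.h>

int main(void)
{
  /* FLT_EVAL_METHOD (C99): 0 = expressions evaluated in their own
     type, 1 = float promoted to double, 2 = evaluation in long
     double (e.g. x87 extended precision), -1 = indeterminable. */
#if FLT_EVAL_METHOD == 0
  puts("expressions evaluated in their semantic type");
#elif FLT_EVAL_METHOD == 2
  puts("expressions may use extended precision (e.g. x87)");
#else
  printf("FLT_EVAL_METHOD = %d\n", (int)FLT_EVAL_METHOD);
#endif
  return 0;
}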

> >No. This bug is due to the extended precision. Floating-point code
> >affected by rounding errors vs discontinuous functions (such as ==)
> >will have problems, whether extended precision is used or not.
> 
> Yes, I understand the nature of the bug mentioned - floating point
> hardware may use more than "double" precision during calculations.
> But I don't see why it should be treated any differently from any
> other mistaken attempts at comparing floating point numbers.

Because in various cases, using == is not a mistake. Really, I expect
that on current machines, 14.0/7.0 == 2.0 evaluates to 1 (true), which
was unfortunately not always the case 30 years ago.

-- 
Vincent Lefèvre <vincent@xxxxxxxxxx> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

