Re: Floating point behavior

Brian Dessent <brian@xxxxxxxxxxx> · Thu, 04 Sep 2008 02:00:59 -0700

eool wrote:

> Basically, the reason given in the FAQ is that the result of cos(x)
> (intermediate, higher precision value) will be stored in a register and thus
> truncated. Then the result of cos(y) is directly compared with the truncated
> value located in a register, resulting in this strange (?) behavior. Now, my
> question is, why isn't gcc able to tell that it is about to compare two
> floating point numbers, one of which has been truncated, and then similarly

You can get this with -ffloat-store.  It's not enabled by default as it
decreases performance because it adds the redundant inefficiency of
having to transfer all the intermediate results of computations from
registers to memory only to just reload them back into registers again. 
Normally the optimizing compiler strives very hard to avoid this very
thing, so this is really a poor workaround to deal with the a quirk of
one particular architecture.

And in general floating point units do not work this way.  It's only the
x87 that has this quirk of performing calculations with extended
precision.  And even with this oddball implementation, the behavior is
still optional: the precision is configurable.  It's just that the
default is 80 bits on many operating systems because there are legacy
math routines that depend on the excess precision for numerical
stability.  But if you aren't dealing with such code you can simply set
the precision in the 387 control word to 64 bits and the problem
vanishes because there is no more 80->64 truncation.  Recent versions of
gcc even have an option (-mpc64) to do this for you at program startup.

An alternative is to avoid the 387 unit and its antiquated 21 year old
warts entirely.  The SSE unit has much more modern and sane design, and
it does calculations on doubles only in double precision.  You can use
-fpmath=sse to tell gcc to avoid the 387 unit and use the sse unit
instead.  Of course this means you have to target an architecture that
supports sse2 (for doubles) and that rules out a fair number of older
systems, notably the AMD line prior to the 64 bit models.  If you're
distributing software in binary form it may not be realistic to assume
sse2 support.  But for x86_64 targets, sse2 is required and is the
default anyway so you don't even have to really worry about it there.

> truncate the second value before the comparison? (In fact, since the return
> type of cos() is double, shouldn't the intermediate value be truncated to
> fit a double in any case?)

The IA-32 calling convention dictates that floating point return values
are passed in the %st(0) register, not on the stack.  So just calling a
function that returns double doesn't necessarily result in any
truncation.  It's only when a value is moved out of a register and
stored into memory that the truncation occurs.

> Second (this question might be even more naive than the previous one), if
> you are never, ever, supposed to compare floating point numbers using ==; if
> you are always supposed to check the difference of the numbers against an
> epsilon value; then how come the compiler can't automatically do this? Is it
> because analyzing the source code to track the accumulated floating point
> errors (to appropriately adjust epsilon) might be prohibitively
> difficult/slow? Or are there actually any situations where you'd want to
> compare two floating point numbers directly with each other?

C is not a language where the compiler has any room for interpretation
of what you specify.  People expect C compilers to generate the code
that's written without second guessing the intent of the programmer. 
Besides, the epsilon value has to be scaled appropriately to the
predicates of the comparison, so it's not like there's one value that
fits all situations.  And as already mentioned, there are existing
workarounds that mitigate or completely avoid the problem.

Again, the excess precision issue is a design quirk of a particular
implementation that, while at one point in time was very widespread, is
mercifully obsolete these days.  Users targeting sparc, power, arm,
mips, alpha (etc.) platforms would all be very unhappy to see their code
changed to work around something that doesn't affect them.

Brian