Re: binary compiled with -O1 and w/ individual optimization flags are not the same

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



CSights wrote:

Currently using doubles, but thanks for reminding me about the number of decimals that make sense.

By default calculations on the 387 are done by the hardware in 80 bits
precision, but truncated down to 64 (assuming double types) when moved
out of the registers.  There are a number of ways to deal with it, or at
least expose it:

-ffloat-store will cause gcc to always move intermediate results out of
registers and into memory, which effectively gets rid of the excess
precision at the cost of a speed hit.

	Progress! Now the program output matching blocks are
(O0 -ffloat-store == O1 ffloat-store == O2 ffloat-store) != (O0) != (O1 == O2 == O3) In other words, now the O0 matches 1,2 with the addition of -ffloat-store, even though it still doesn't match the Ox without ffloat-store. Does this suggest to you the mismatching output was due to decimal point differences rather than other problems (aliasing for example)?
It suggests that you were in fact getting more than 53-bit double somewhere, and that it's not an aliasing error.
Also, I didn't mention earlier (did I?) that the program's output when compiled on the Macintosh matched at all optimization levels. (O0 == O1 == O2) (Though the output did not match any output from the program compiled on linux.) Is this possibly b/c the Mac has sse2 (Core 2 Duo) and able to use those instructions which have more meaningful decimal places?
If you use SSE2, you have no extra precision for -ffloat-store to suppress. Assuming the machine where you used 387 has SSE2 hardware, you could set -mfpmath=sse That is the default for 64-bit gcc.
I've tried using floats only in the what I guess is the key calculation involving the exp(), then casting to double (so that I don't have to modify all the code to be float), but this doesn't result in matching output between O1 and O0. Does the compiler do any recasting of float->double double->float behind the scenes?

The 387 exp() performs all its calculations with extra precision. Then, if you don't set -ffloat-store, it may never get rounded down. If you have no SSE2 math library, you will still get 387 exp() even if you set -mfpmath=sse, but there will be an implicit -ffloat-store in the conversion of the result to SSE2.

[Index of Archives]     [Linux C Programming]     [Linux Kernel]     [eCos]     [Fedora Development]     [Fedora Announce]     [Autoconf]     [The DWARVES Debugging Tools]     [Yosemite Campsites]     [Yosemite News]     [Linux GCC]

  Powered by Linux