Hi Brian and list,

> What data types are used in this calculation? floats only offer ~7
> decimals and doubles only ~14 decimals of precision, so there's no way
> you can compare past that and expect consistency, unless you're using
> some kind of bignum package.

Currently using doubles, but thanks for reminding me about the number of decimals that make sense.

> By default calculations on the 387 are done by the hardware in 80 bits
> precision, but truncated down to 64 (assuming double types) when moved
> out of the registers. There are a number of ways to deal with it, or at
> least expose it:
>
> -ffloat-store will cause gcc to always move intermediate results out of
> registers and into memory, which effectively gets rid of the excess
> precision at the cost of a speed hit.

Progress! The program's outputs now fall into these matching blocks:

    (O0 -ffloat-store == O1 -ffloat-store == O2 -ffloat-store) != (O0) != (O1 == O2 == O3)

In other words, with -ffloat-store added, O0 now matches O1 and O2, even though it still doesn't match the Ox builds without -ffloat-store. Does this suggest to you that the mismatching output was due to differences in floating-point precision rather than other problems (aliasing, for example)?

Also, I didn't mention earlier (did I?) that the program's output when compiled on the Macintosh matched at all optimization levels (O0 == O1 == O2), though it did not match any output from the program compiled on Linux. Is this possibly because the Mac has SSE2 (Core 2 Duo) and is able to use those instructions, which have more meaningful decimal places?

If this is the problem, what would be a good way of dealing with it? Throwing away the meaningless decimal digits is okay with me, but avoiding the performance hit that comes with -ffloat-store would be nice. It would also be nice if the output did not depend on compiler flags. Is there a way to do the float-store equivalent in the program code itself?
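
To make the question concrete, here is a minimal sketch of what I imagine "float-store in the code" could look like. It assumes that a store through a volatile double forces the value out of the 80-bit register and rounds it to 64 bits; the helper name store_as_double is made up for illustration, and I have not confirmed that this matches -ffloat-store in every case.

```c
/* Hypothetical helper (name invented for this sketch): round a value
   to 64-bit double by forcing it through memory.  The volatile
   qualifier is meant to stop the compiler from keeping the
   intermediate result in an x87 register. */
static double store_as_double(double x)
{
    volatile double tmp = x;  /* the store should round to 64 bits */
    return tmp;               /* reload the rounded value */
}
```

The idea would be to wrap only the sensitive intermediate results, e.g. `y = store_as_double(exp(a) * b);`, instead of paying for -ffloat-store everywhere.
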
The goal is for the program's output to match when compiled with O0, O1 and O2, as it does with -ffloat-store.

I've tried using floats only in what I guess is the key calculation, the one involving exp(), and then casting to double (so that I don't have to convert all the code to float), but this doesn't make the output match between O1 and O0. Does the compiler do any float->double or double->float conversion behind the scenes?

Another way might be to keep using doubles but zero out the least significant bits that a float does not have, and then use these modified doubles in the calculation?

Thanks again!
C.
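
P.S. A rough sketch of the bit-zeroing idea, in case it helps the discussion. It assumes IEEE 754 doubles (52 stored mantissa bits) and floats (23 stored mantissa bits), so the 29 lowest mantissa bits get cleared. The function name is made up for illustration. Note that this truncates rather than rounds and leaves the exponent alone, so I don't know whether it would actually make the outputs agree.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (name invented for this sketch): clear the 29
   low mantissa bits of a double, i.e. the bits a float's mantissa
   does not have (52 - 23 = 29).  Assumes IEEE 754 binary64 layout.
   Truncates toward zero; the exponent is not changed. */
static double drop_low_mantissa_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);        /* well-defined way to view the bits */
    bits &= ~((UINT64_C(1) << 29) - 1);    /* zero mantissa bits 0..28 */
    memcpy(&x, &bits, sizeof x);
    return x;
}
```
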