gcc 4.6.1 fpmath options produce unexpected results

tomsen_san <thomas.muehlfriedel@xxxxxx> · Wed, 1 Feb 2012 13:11:50 -0800 (PST)



Hi there,
I am running some numerical benchmarking on Ubuntu 11.10 with gcc 4.6.1 on
an Athlon XP (Barton).
I noticed that -O3 -march=athlon-xp -mfpmath=sse (or -mfpmath=both) produces
code that runs twice as fast for double-precision FP operations as
-O3 -march=athlon-xp -mfpmath=387.
I looked at the assembly code and found that the reason is inefficient code
generated in the latter case.
For a CPU that does not support SSE2 or later, double-precision arithmetic has
to go through the x87 unit either way, so this result is slightly
disconcerting: it makes it hard to select the correct options for producing
fast code. I documented this here:
http://mandelperformance.blogspot.com/2012/01/magic-gcc-options.html
It would be nice if someone could comment on the fact that the SSE option
produces perfect 387 code while the 387 option does not  :-)
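
For anyone who wants to compare the generated assembly themselves, here is a
minimal sketch of the kind of double-precision loop involved. It assumes a
Mandelbrot-style iteration. The function name mandel_iter and the file name
kernel.c are made up for illustration and are not taken from the benchmark;
the actual code is in the blog post linked above.

```c
/* Illustrative kernel only; NOT the benchmark code from the blog post.
 * mandel_iter and kernel.c are hypothetical names.
 *
 * To compare the code generated for the two settings, something like:
 *   gcc -O3 -march=athlon-xp -mfpmath=387 -S kernel.c -o kernel_387.s
 *   gcc -O3 -march=athlon-xp -mfpmath=sse -S kernel.c -o kernel_sse.s
 * (on an x86-64 host, -m32 would presumably be needed as well)
 */

/* Count iterations of z = z*z + c until |z|^2 exceeds 4 or max_iter is hit. */
int mandel_iter(double cr, double ci, int max_iter)
{
    double zr = 0.0, zi = 0.0;
    int n = 0;

    while (n < max_iter && zr * zr + zi * zi <= 4.0) {
        double t = zr * zr - zi * zi + cr;  /* real part of z*z + c */
        zi = 2.0 * zr * zi + ci;            /* imaginary part of z*z + c */
        zr = t;
        n++;
    }
    return n;
}
```

Diffing the two .s files should show whether the inner loop differs between
the two -mfpmath settings.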

best regards

-tomsen 