Jiri Hladky wrote:
Hi all,
I have tried to compare the SSE version of sin() with the 387 version of sin() on
- Core 2 Duo (Intel(R) Core(TM)2 CPU T7400 @ 2.16GHz)
- AMD Athlon(tm) 64 X2 Dual Core Processor 4800+
To my surprise,
gcc -c -g -m64 -Wall -mfpmath=387 -o sin_387.o sin.c
still uses the SSE version of sin() from libm.a (and does not call fsin, as I
would expect).
When I use -mfpmath=387 -ffast-math
gcc -c -g -m64 -Wall -mfpmath=387 -ffast-math -o sin_387_fast_math.o sin.c
then I get the 387 version of sin()!
This puzzles me.
To me this looks like a bug: I would expect -mfpmath=387 to always use the 387
version of sin(). And why -ffast-math enables the 387 version is also unclear to me.
Testcase is attached. Run it with
make clean && make run
************Results on Athlon***************************************
time sin_387 0.5
Input: 0x1p-1, Output: 0x1.44e3aefd8ba93p-15
27.99user 0.04system 0:28.36elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+93minor)pagefaults 0swaps
time sin_sse 0.5
Input: 0x1p-1, Output: 0x1.44e3aefd8ba93p-15
28.70user 0.06system 0:29.30elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+93minor)pagefaults 0swaps
time sin_387_fast_math 0.5
Input: 0x1p-1, Output: 0x1.44e3aefe00c04p-15
82.72user 0.08system 1:23.84elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+92minor)pagefaults 0swaps
**************************************************************************
diff sin_387.obj sin_sse.obj
=> the files are identical!
grep fsin *obj*
=> hits only in sin_387_fast_math*obj*
I have tried with the following two versions of gcc:
gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)
gcc version 4.2.1 (SUSE Linux)
Redirecting to gcc-help, where you are more likely to get this
answered. I suppose it's something like this: you don't get inlined
trig functions unless you set -ffast-math, possibly on the assumption
that a library function could take more precautions about corner cases,
or because the default is to require errno setting. For example, the fsin
instruction returns the argument rather than its sine when the magnitude
of the argument is too large (2^63 or more). The libm for x86-64 supports
only SSE functions, as that is the default decreed by the ABI.