Re: -mfpmath=387 should enable x387 intrinsics on m64

Tim Prince <TimothyPrince@xxxxxxxxxxxxx> · Tue, 10 Jun 2008 19:28:03 -0400

Jiri Hladky wrote:
Hi all,

I have tried to compare the SSE version of sin() with the 387 version on
- Core 2 Duo (Intel(R) Core(TM)2 CPU         T7400  @ 2.16GHz)
- AMD Athlon(tm) 64 X2 Dual Core Processor 4800+

To my surprise,
gcc -c -g -m64 -Wall -mfpmath=387 -o sin_387.o sin.c
still uses the SSE version of sin() from libm.a (and does not call fsin,
as I would expect).

When I use -mfpmath=387 together with -ffast-math,
gcc -c -g -m64 -Wall -mfpmath=387 -ffast-math -o sin_387_fast_math.o sin.c
then I do get the 387 version of sin()!

This puzzles me.

To me it looks like a bug: I would expect -mfpmath=387 to always use the
387 version of sin(). It is also unclear to me why -ffast-math enables it.

The testcase is attached. Run it with

make clean && make run 

************Results on Athlon***************************************
time sin_387 0.5
Input: 0x1p-1, Output: 0x1.44e3aefd8ba93p-15
27.99user 0.04system 0:28.36elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+93minor)pagefaults 0swaps
time sin_sse 0.5
Input: 0x1p-1, Output: 0x1.44e3aefd8ba93p-15
28.70user 0.06system 0:29.30elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+93minor)pagefaults 0swaps
time sin_387_fast_math 0.5
Input: 0x1p-1, Output: 0x1.44e3aefe00c04p-15
82.72user 0.08system 1:23.84elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+92minor)pagefaults 0swaps
**************************************************************************
diff sin_387.obj sin_sse.obj
=> the files are identical!

grep fsin *obj*
=> matches only in sin_387_fast_math*obj*

I have tried the following two versions of gcc:
gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)
gcc version 4.2.1 (SUSE Linux)

Redirecting to gcc-help, where you are more likely to get this
answered.  I suppose it's something like this:  you don't get in-lined
trig functions unless you set -ffast-math, possibly on the assumption
that a library function can take more precautions about corner cases,
or because the default is to require errno setting.  For example, the
fsin instruction returns its argument unchanged, rather than the sine,
when the magnitude of the argument is too large (2^63 or more).  The
libm for x86-64 provides only SSE implementations, as SSE is the
default decreed by the ABI.