Re: GCC's -ffast-math behavior

Miles Bader <miles@xxxxxxx> · Fri, 17 Feb 2012 10:19:50 +0900

2012/2/17 James Cloos <cloos@xxxxxxxxxxx>:
>>>>>> "MB" == Miles Bader <miles@xxxxxxx> writes:
>
>>> But in my experience, -mfpmath=sse will slow my code very much.
>
> MB> Hmm, I've always found SSE FP to be a speedup -- sometimes a _big_
> MB> speedup -- over 387 FP, at least when one is using mostly primitive
> MB> FP operations (mul, divide, sqrt, etc) ... I think it's worth
> MB> testing, at least.
>
> Many years ago, when I asked about using -mfpmath=sse on an ia32 box, the
> advice was that, because the function args and return values had to be
> passed on the 387 stack, most code would be much slower.

I suppose it depends on the actual content of the functions whether
that would be a significant factor.

In general, I'd think there shouldn't be a whole lot of function
calling going on in an inner loop unless the functions in question
actually do something non-trivial.  I think this is especially true
for a lot of FP-intensive coding styles, where somewhat more attention
is paid to throughput, and a bit less to things like abstraction.
And the more a function does, the less impact the function call
itself has, so a speed increase in primitive operations should make
up for some extra per-call overhead.

> Some of the new chips seem to have specific optimizations to deal with
> code which constantly moves values between registers and the stack, so
> it is probably less of an issue on newer chips than it used to be.

My earlier observation is based mostly on benchmarks on P3-era CPUs
(the last time I used the traditional x86 ABI much).  I dunno how
representative that is...

> But if one is using a newer chip, why not upgrade to -m64, too?

Totally :]

-miles

-- 
Cat is power.  Cat is peace.