Re: Floating point performance issue

Tim Prince <n8tm@xxxxxxx> · Tue, 20 Dec 2011 07:20:36 -0500

On 12/20/2011 7:01 AM, Vincent Lefevre wrote:
On 2011-12-20 12:48:53 +0100, Dario Saccavino wrote:
In the second program, if 0.5 < f < 1 the values of a and b eventually
become the smallest representable denormal value and never change
afterwards, resulting in a large number of operations involving
denormal numbers.

Yes, I agree (I forgot about that)... except that if f is close enough
to 1, you won't have subnormals and the program will be fast (like in
the case f <= 0.5).
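
To make the rounding argument above concrete, here is a minimal sketch in C. It is not the original poster's program: the helper names decay() and smallest_subnormal() are made up for illustration, and it assumes IEEE 754 double arithmetic with round-to-nearest-even and no flush-to-zero (the default on x86-64 without -ffast-math).

```c
/* Multiply x by f, n times, the way a simple decay loop would.
   Hypothetical helper, not taken from the programs discussed in the thread. */
static double decay(double x, double f, int n)
{
    for (int i = 0; i < n; i++)
        x *= f;
    return x;
}

/* Smallest positive subnormal double, 2^-1074, written as a C99 hex literal.
   Assumes double is IEEE 754 binary64. */
static double smallest_subnormal(void)
{
    return 0x1p-1074;
}
```

Under those assumptions, decay(1.0, 0.6, 5000) stops at smallest_subnormal(), because 0.6 times the smallest subnormal rounds back up to it, so every further multiplication is a subnormal operation. decay(1.0, 0.5, 5000) reaches exactly 0.0, because half of the smallest subnormal is a tie that rounds to even. The stuck value is not always the very smallest one: with f = 0.75 the sequence stops at twice the smallest subnormal.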

gcc enables FTZ (flush-to-zero) when using SSE and -ffast-math (I think
the specific compiler flag is -funsafe-math-optimizations).

Thanks, good to know...

Therefore the flags needed are -msse2 -mfpmath=sse -ffast-math

I would discourage the use of -ffast-math, which can affect generic
code very badly (due to -funsafe-math-optimizations). Isn't there
an option to enable FTZ?
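
One possible way to get FTZ without -ffast-math is to set the flush-to-zero bit in the SSE control register (MXCSR) from the program itself, using the _MM_SET_FLUSH_ZERO_MODE macro from <xmmintrin.h>. The sketch below assumes an x86 target whose floating-point arithmetic is done in SSE registers (the x86-64 default, or -msse2 -mfpmath=sse on 32-bit). The helpers set_ftz() and half_of_dbl_min() are hypothetical names used only for illustration. This is a source-level workaround, not a compiler option.

```c
#include <float.h>

#if defined(__SSE2__)
#include <xmmintrin.h>   /* _MM_SET_FLUSH_ZERO_MODE, _MM_FLUSH_ZERO_ON/OFF */
#define HAVE_SSE_FTZ 1
#else
#define HAVE_SSE_FTZ 0
#endif

/* Switch flush-to-zero on or off in MXCSR for the calling thread.
   Returns 1 if the mode was set, 0 if this build has no SSE2 support.
   Hypothetical helper for illustration. */
static int set_ftz(int enable)
{
#if HAVE_SSE_FTZ
    _MM_SET_FLUSH_ZERO_MODE(enable ? _MM_FLUSH_ZERO_ON : _MM_FLUSH_ZERO_OFF);
    return 1;
#else
    (void)enable;
    return 0;
#endif
}

/* DBL_MIN / 2 is subnormal. The volatile operands keep the compiler from
   folding the product at compile time, so the multiplication really runs
   under the current MXCSR setting. */
static double half_of_dbl_min(void)
{
    volatile double x = DBL_MIN;
    volatile double half = 0.5;
    return x * half;
}
```

With FTZ off, half_of_dbl_min() returns a positive subnormal. After set_ftz(1) it returns 0.0 on SSE arithmetic, since the subnormal result is flushed. The setting is per thread and has no effect on x87 code, so a 32-bit build using -mfpmath=387 would not see any difference.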

-ffast-math appears to have been made much more sane in current gcc 
versions (e.g. observance of parentheses is on by default).  Back in the 
pre-SSE days over a decade ago, which we are revisiting in this thread, 
the most widely used mathinline.h implementations had a great deal of 
intentional breakage invoked along with -ffast-math.

CPUs introduced this year, such as Sandy Bridge, are designed to handle
simple underflow situations such as these without serious performance 
degradation.  Like OP, I have a CPU which was introduced over 5 years 
ago, where many of the characteristics are of only historic interest.

--
Tim Prince