Re: Floating point performance issue

Segher Boessenkool <segher@xxxxxxxxxxxxxxxxxxx> · Tue, 20 Dec 2011 13:30:30 +0100

I tested this on a PowerPC 970 so I could get lovely charts from
Shark.  The problem is much less severe there, but it is totally
obvious that the cause is the default rounding mode (round to
nearest, ties to even): with it, the denormal sticks around
for > 0.5.

>> Therefore the flags needed are -msse2 -mfpmath=sse -ffast-math
>
> I would discourage the use of -ffast-math, which can affect generic
> code very badly (due to -funsafe-math-optimizations). Isn't there
> an option to enable FTZ?

Dunno about that(*), but you can portably do

  fesetround(FE_TOWARDZERO);      /* declared in <fenv.h> (C99) */

and that prevents the problem from occurring as well.

Segher

(*) So I looked it up: in gcc/config/i386/crtfastmath.c, the code is
(for x86-64):

#define MXCSR_DAZ (1 << 6)      /* Enable denormals are zero mode */
#define MXCSR_FTZ (1 << 15)     /* Enable flush to zero mode */

  unsigned int mxcsr = __builtin_ia32_stmxcsr ();
  mxcsr |= MXCSR_DAZ | MXCSR_FTZ;
  __builtin_ia32_ldmxcsr (mxcsr);
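
The same two bits can be set from ordinary code without -ffast-math.
This is only a sketch, assuming x86-64 (where double arithmetic goes
through SSE) and the _mm_getcsr/_mm_setcsr intrinsics from
<xmmintrin.h>; the function name halve_dbl_min and the -1.0 fallback
for other targets are made up for illustration:

```c
#include <float.h>

#if defined(__x86_64__)
#include <xmmintrin.h>              /* _mm_getcsr, _mm_setcsr */
#endif

#define MXCSR_DAZ (1u << 6)         /* denormals-are-zero */
#define MXCSR_FTZ (1u << 15)        /* flush-to-zero */

/* Hypothetical helper: halve DBL_MIN (the smallest normal double) with
   FTZ/DAZ switched on (flush != 0) or off (flush == 0), then restore
   the caller's MXCSR.  Returns the product, or -1.0 on targets where
   this sketch has no MXCSR to work with. */
double halve_dbl_min(int flush)
{
#if defined(__x86_64__)
    unsigned int saved = _mm_getcsr();
    if (flush)
        _mm_setcsr(saved | MXCSR_DAZ | MXCSR_FTZ);
    else
        _mm_setcsr(saved & ~(MXCSR_DAZ | MXCSR_FTZ));

    volatile double x = DBL_MIN;
    volatile double y = x * 0.5;    /* would be subnormal; zero under FTZ */

    _mm_setcsr(saved);
    return y;
#else
    (void)flush;
    return -1.0;
#endif
}
```

On x86-64, halve_dbl_min(1) should come back as 0.0 and
halve_dbl_min(0) as a positive subnormal.  MXCSR is per thread, so
each thread that cares has to set the bits itself.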