code optimizations and numerical research

p@xxxxxxxxx (Peter Jay Salzman) · Mon, 16 May 2005 10:45:07 -0400

Hi all,

I'm a physicist doing research on quantum gravity.  Although I'm good at
writing programs to solve all kinds of non-linear PDEs, ODEs, and integral
equations, I'm not a computer scientist and not savvy with a lot of lingo
and warnings in the gcc man/info pages.  I'd like to ask for help with code
speed optimization options in gcc.

The most effective optimizations for the code I'm currently writing are:

   -O3 -funroll-loops -march=athlon-xp -ffast-math

That last option, -ffast-math, is what I'd like to ask about.  According to
the man page:

   can result in incorrect output for programs which depend on an exact
   implementation of IEEE or ISO rules/specifications for math functions.

and according to the GCC Complete Reference:

   Certain calculations are made faster by violating some of the ISO and
   IEEE rules.  For example, with this option set it is assumed that no
   negative values are passed to sqrt() and that all floating point values
   are valid.

First and foremost, my code must generate correct output --- this is science
that's helping to lay groundwork for a future theory of quantum gravity,
not a Lucas Arts SCUMM emulator.  :)

So what happens when the assumptions mentioned above are NOT true?  For
example, in this code:

   #include <math.h>
   #include <stdio.h>

   int main(void)
   {
      /* square root of a negative number: no real result exists */
      double d = sqrt(-1);
      printf("%f\n", d);

      return 0;
   }

the behavior appears to be the same whether it's compiled with -ffast-math
or not: it simply prints "nan".
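
A probe that might be more telling than sqrt(-1) is one that tests for NaN
explicitly.  The following is only a sketch, and it rests on an assumption on
my part: that -ffast-math implies -ffinite-math-only, which would allow the
compiler to treat "x != x" as always false.  The helper names
is_nan_by_compare, make_nan and report_nan_check are made up for this post.

```c
#include <stdio.h>

/* Returns 1 if x is NaN, relying on the IEEE rule that NaN != NaN.
   Assumption: with -ffinite-math-only the compiler may assume x is never
   NaN and is then free to fold this comparison to 0. */
int is_nan_by_compare(double x)
{
   return x != x;
}

/* Produces a NaN at run time.  The volatile qualifier keeps the compiler
   from evaluating 0.0 / 0.0 at compile time. */
double make_nan(void)
{
   volatile double zero = 0.0;
   return zero / zero;
}

/* Prints the value and whether the self-comparison test flagged it. */
void report_nan_check(void)
{
   double d = make_nan();
   printf("value = %f, detected as NaN: %d\n", d, is_nan_by_compare(d));
}
```

Calling report_nan_check() from main and comparing the output with and
without -ffast-math should show whether the NaN test survives the
optimization.  Without the flag I'd expect the test to report 1.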

I've Googled and Googled, but everything I've found on GCC code
optimizations (like "-fno-signaling-nans") appears to simply quote the man
and info pages.  I'm not finding a "dummy's" guide to picking code
optimization.  Even the GCC Complete Reference book is very skimpy on the
details of code optimization.

The man page claims that "-ffast-math" may produce wrong results for
programs that depend on "an exact implementation of IEEE or ISO
rules/specifications for math functions."

What exactly does this vague sentence mean?

In my code, I do use errno, but it's for my own personal "die" function,
like when the program attempts to open a non-existent parameter file.  I can
easily write errno out of my code.
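
For context, my errno use looks roughly like the sketch below.  The function
name open_error_code is a placeholder, not my real code.  As far as I
understand it, -fno-math-errno only concerns math library functions such as
sqrt, so a failed fopen should still set errno either way, but I'd welcome
confirmation of that.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Tries to open a parameter file for reading.
   Returns 0 on success, or the errno value set by fopen on failure. */
int open_error_code(const char *path)
{
   FILE *fp;

   errno = 0;
   fp = fopen(path, "r");
   if (fp == NULL) {
      int saved = errno;   /* save it before another call can overwrite it */
      fprintf(stderr, "cannot open %s: %s\n", path, strerror(saved));
      return saved;
   }
   fclose(fp);
   return 0;
}
```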

I also enable and catch certain floating point exceptions which appear to be
disabled by default(!) like

       * FE_DIVBYZERO   division by zero
       * FE_UNDERFLOW   result not representable due to underflow
       * FE_OVERFLOW    result not representable due to overflow
       * FE_INVALID     invalid operation

Unlike errno, which is expendable, I would definitely like to be able to
catch these FPEs.  There are a lot of factors like 10^-34, 10^-11 and 10^-31
in my code, and even when the equations are scaled, I really do need the
program to come to a grinding halt when anything becomes inf or nan, or
when anything underflows or overflows.

Lastly, after some experimentation, I found that it's actually the
combination of "-fno-math-errno -funsafe-math-optimizations" (both enabled
by -ffast-math) that really makes my code fly.  I'm talking about a factor
of almost 400%!!!

But, here again, the documentation describes what
-funsafe-math-optimizations does (violates IEEE and ANSI standards, assumes
values are correct, optimizes the operation of the hardware FPU in
non-standard ways) but doesn't tell me what I really care about: under what
situations (that an educated layman like me will understand) will this option
generate incorrect output?
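
One case I can imagine, offered as a guess and not as a fact about GCC:
floating-point addition is not associative, so if the option lets the
compiler regroup sums as though they were real-number arithmetic, code that
depends on the exact order of operations could change.  Compensated (Kahan)
summation is the textbook example.  The sketch below uses made-up function
names and is not taken from my research code.

```c
#include <stddef.h>

/* Plain left-to-right summation. */
double naive_sum(const double *x, size_t n)
{
   double s = 0.0;
   for (size_t i = 0; i < n; i++)
      s += x[i];
   return s;
}

/* Kahan (compensated) summation: c carries the low-order bits that are
   lost each time a small term is added to a large running sum. */
double kahan_sum(const double *x, size_t n)
{
   double s = 0.0, c = 0.0;
   for (size_t i = 0; i < n; i++) {
      double y = x[i] - c;
      double t = s + y;
      c = (t - s) - y;   /* zero in exact algebra; the rounding error in IEEE */
      s = t;
   }
   return s;
}
```

With an input of 1.0 followed by ten copies of 1e-16, naive_sum returns
exactly 1.0 in strict double arithmetic while kahan_sum returns something
slightly larger.  If the compiler were allowed to simplify (t - s) - y to
zero, the two functions would agree and the correction would be silently
lost.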

The number of optimization options is dizzying.  Any help would be greatly
appreciated!

Thanks!!!
Pete
