-O* Gives Different Results than Individual Optimizations?

Mike Sullivan <mbsullivan@xxxxxxxxx> · Mon, 12 Oct 2009 00:16:11 -0400

A co-worker of mine is having some odd floating-point issues that are
related to the compilation options passed to GCC v4.3.3. Compiling his
program without any optimizations works fine (p_lower should be
exactly 1.0...):

gfortran main.f90 -o main
./main
 p_lower:  1.00000000000000000

Even raising the optimization level to -O1 causes problems:

gfortran main.f90 -o main -O1
./main
 p_lower:  0.96875000000000000

The problem comes from a difference in FP rounding that pushes a <=
comparison from one side of a corner case to the other. For the
purposes of this message, the magnitude of the difference is
irrelevant, as is the fact that the problem can be fixed by coding it
a different way.

I decided to try to figure out which compiler option(s) were
responsible. I looked at
http://gcc.gnu.org/onlinedocs/gcc-4.3.3/gcc/Optimize-Options.html for
the list of flags that -O1 turns on. However, even passing all of the
individual flags that are supposedly in -O1 doesn't cause the error
to occur:

gfortran main.f90 -o main \
    -fauto-inc-dec \
    -fcprop-registers \
    -fdce \
    -fdefer-pop \
    -fdse \
    -fguess-branch-probability \
    -fif-conversion2 \
    -fif-conversion \
    -finline-small-functions \
    -fipa-pure-const \
    -fipa-reference \
    -fmerge-constants \
    -fsplit-wide-types \
    -ftree-ccp \
    -ftree-ch \
    -ftree-copyrename \
    -ftree-dce \
    -ftree-dominator-opts \
    -ftree-dse \
    -ftree-fre \
    -ftree-sra \
    -ftree-ter \
    -funit-at-a-time
./main
 p_lower:  1.00000000000000000

In fact, we can add all of the flags listed for -O2 and -O3 as well,
and the error still doesn't manifest itself:

gfortran main.f90 -o main \
    -fauto-inc-dec \
    -fcprop-registers \
    -fdce \
    -fdefer-pop \
    -fdse \
    -fguess-branch-probability \
    -fif-conversion2 \
    -fif-conversion \
    -finline-small-functions \
    -fipa-pure-const \
    -fipa-reference \
    -fmerge-constants \
    -fsplit-wide-types \
    -ftree-ccp \
    -ftree-ch \
    -ftree-copyrename \
    -ftree-dce \
    -ftree-dominator-opts \
    -ftree-dse \
    -ftree-fre \
    -ftree-sra \
    -ftree-ter \
    -funit-at-a-time \
    -fthread-jumps \
    -falign-functions \
    -falign-jumps \
    -falign-loops \
    -falign-labels \
    -fcaller-saves \
    -fcrossjumping \
    -fcse-follow-jumps \
    -fcse-skip-blocks \
    -fdelete-null-pointer-checks \
    -fexpensive-optimizations \
    -fgcse \
    -fgcse-lm \
    -foptimize-sibling-calls \
    -fpeephole2 \
    -fregmove \
    -freorder-blocks \
    -freorder-functions \
    -frerun-cse-after-loop \
    -fsched-interblock \
    -fsched-spec \
    -fschedule-insns \
    -fschedule-insns2 \
    -fstrict-aliasing \
    -fstrict-overflow \
    -ftree-pre \
    -ftree-vrp \
    -finline-functions \
    -funswitch-loops \
    -fpredictive-commoning \
    -fgcse-after-reload \
    -ftree-vectorize
./main
 p_lower:  1.00000000000000000
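
Since -O1 on its own triggers the problem, the opposite experiment
may be more telling: keep -O1 and switch off one flag at a time using
its -fno- form. The following is only a sketch. It assumes main.f90
is in the current directory and that every flag in the list has a
-fno- counterpart; the names negate_flag, O1_FLAGS and main_test are
hypothetical and not part of the original test.

```shell
#!/bin/sh
# Turn an -f<name> flag into its negative form -fno-<name>.
negate_flag() {
    printf '%s\n' "$1" | sed 's/^-f/-fno-/'
}

# A few of the -O1 flags from the list above; extend as needed.
O1_FLAGS="-fdce -fdse -ftree-ccp -ftree-fre -ftree-sra -ftree-ter"

# Only attempt the builds when the compiler and the source are present.
if command -v gfortran >/dev/null 2>&1 && [ -f main.f90 ]; then
    for flag in $O1_FLAGS; do
        no_flag=$(negate_flag "$flag")
        # Build with -O1 minus one flag, then print what the program reports.
        if gfortran main.f90 -o main_test -O1 "$no_flag" 2>/dev/null; then
            printf '%-28s %s\n' "$no_flag" "$(./main_test)"
        else
            printf '%-28s %s\n' "$no_flag" "(build failed)"
        fi
    done
fi
```

If one of the -fno- variants brings p_lower back to 1.0, that flag
would be a candidate for the pass that changes the rounding.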

So, my question is: what might be going on to make the "packaged"
optimizations (-O1, -O2, -O3) cause anomalous results, but passing the
individual optimization flags doesn't affect anything? Do I not
understand how the optimization flags work with GCC?
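
One check that might narrow this down is to ask the driver itself
which optimizer flags it considers enabled at each level, instead of
relying on the documentation page. The sketch below assumes the
installed gcc accepts -Q --help=optimizers (documented for the 4.3
series) and that its output marks flags with [enabled] or [disabled];
enabled_flags is a hypothetical helper name. It also assumes the
reported state reflects what the compiler actually does, which may
not hold when no -O level is given.

```shell
#!/bin/sh
# Read `-Q --help=optimizers` style output on stdin and print the
# flags marked [enabled], one per line, sorted.
enabled_flags() {
    awk '$2 == "[enabled]" { print $1 }' | LC_ALL=C sort
}

if command -v gcc >/dev/null 2>&1; then
    tmp=$(mktemp -d)
    gcc -Q --help=optimizers -O0 2>/dev/null | enabled_flags > "$tmp/O0"
    gcc -Q --help=optimizers -O1 2>/dev/null | enabled_flags > "$tmp/O1"
    # Flags reported as enabled at -O1 but not at -O0.
    LC_ALL=C comm -13 "$tmp/O0" "$tmp/O1"
    rm -rf "$tmp"
fi
```

Comparing that output against the hand-written flag list above would
show whether -O1 turns on anything the list is missing.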

Thank you in advance,
Michael

PS: I omitted -fdelayed-branch because this is running on an x86 machine.