Re: pragma GCC optimize prevents inlining

On 05/01/2024 19:19, Segher Boessenkool wrote:
On Fri, Jan 05, 2024 at 04:53:35PM +0100, David Brown via Gcc-help wrote:
-ffast-math is allowed to introduce any rounding error it wants.  Which
can (in a loop for example) easily introduce unlimited rounding error,
bigger than the actual result.  And this is not just theoretical either.


Normal maths mode can also lead to rounding errors that can build up -
the fact that rounding is carefully specified with IEEE does not mean
there are no errors (compared to the theoretical perfect real-number
calculation).

That's not the point.  A program can be perfectly fine, with bounded
errors and all, and then -ffast-math will typically completely destroy
all that, and replace all arithmetic by the equivalent of a dice roll.


The only difference between IEEE calculations and -ffast-math calculations is that with IEEE, the ordering and rounding are controlled and consistent. For any given /single/ arithmetic operation that is performed, each can have the same amount of rounding error, or error due to the limited length of the mantissa. Agreed?

If you have a /sequence/ of calculations using IEEE, then the order of the operations and the types of roundings and other errors will be defined and consistent. It won't change if you change options, optimisations, compilers, targets. It won't change if you make changes to the source code that should not affect the result.

So if you do extensive and careful analysis of the possible maximum errors, and wide-ranging testing of possible inputs, you can be confident of the accuracy of the results despite the inherent rounding errors.


If you have the same code, but use -ffast-math, then the order in which the calculations are done may change unexpectedly, or they may be combined or modified in certain ways. You don't have the same consistency.

If you do extensive worst-case analysis and testing, you can still be confident in the accuracy of the results.


The reality is that usually, people don't do any kind of serious analysis. Of course /some/ will do so, but most people will not. They will not think about the details - because they don't have to. They will not care whether they write "(a + b) + c" or "a + (b + c)", or which addition the compiler does first. It does not matter to them whether the results are consistent - they are using types with far more precision than they need, and rounding of the least significant bits does not affect them. They don't care - and perhaps don't even know - how precision can be lost when adding or subtracting numbers with significantly different magnitudes - because they are not doing that, at least when taking into account the number of bits in the mantissa of a double.
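
To make the magnitude and grouping point concrete, here is a small sketch of my own (not from the original mails) showing how the grouping of additions matters once the operands differ by more than the precision of a double. It behaves this way under plain IEEE rules, no -ffast-math needed:

	#include <stdio.h>

	int main(void)
	{
	    double big = 1e16, small = 1.0;

	    /* Near 1e16 the spacing between adjacent doubles is 2, so adding 1
	       to 'big' is lost to rounding, while adding 2 is representable. */
	    double left  = (big + small) + small;   /* rounds back to 1e16, twice */
	    double right = big + (small + small);   /* 1e16 + 2 is held exactly   */

	    printf("left  = %.1f\nright = %.1f\n", left, right);
	    return 0;
	}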

IEEE ordering is about consistency - it is not about correctness, or accuracy. Indeed, it is quite likely that associative re-arrangements under -ffast-math give results that are closer to the mathematically correct real maths calculations. (Some other optimisations, like multiplying by reciprocals for division, will likely be slightly less accurate.) I fully appreciate that consistency is often important, and can easily be more important than absolute accuracy. (I work with real-time systems - I have often had to explain the difference between "real time" and "fast".)

No matter how you are doing your calculations, you should understand your requirements, and you should understand the limitations of floating point calculations - whether IEEE or -ffast-math. It is reasonable to say that you shouldn't use -ffast-math unless you know it's okay for your needs, but I think that applies to any floating point work. (Indeed, it is also true for integers - you should not use an integer type unless you are sure its range is suitable for your needs.)

But it is simply wrong to suggest that -ffast-math is inaccurate and the results are a matter of luck, unless you also consider IEEE maths to be inaccurate and a matter of luck.

That has nothing to do with the fact that all floating point arithmetic
is an approximation to real arithmetic (arithmetic on real numbers).
The semantics of 754 (or any other standard followed) make it clear what
the exact behaviour should be, and -ffast-math tells the compiler to
ignore that and do whatever instead.  You cannot have reasonable
programs that way.

That's not what "-ffast-math" does. I really don't understand why you think that. It is arguably an insult to the GCC developers - do you really think they'd put an option in the compiler that is not merely useless, but is deceptively dangerous and designed specifically to break people's code and give them incorrect results?

"-ffast-math" is an important optimisation for a lot of code. It makes it a great deal easier for the compiler to use things like SIMD instructions for parallel calculations, since there is no need to track faults like overflows or NaN signals. It means the compiler can make better use of limited hardware - there is a /lot/ of floating point hardware around that is "-ffast-math" compatible but not IEEE compatible. That applies to many kinds of vector and SIMD units, graphics card units, embedded processors, and other systems that skip handling of infinities, NaNs, signals, traps, etc., in order to be smaller, cheaper, faster and lower power.

The importance of these optimisations can be seen in that "-ffast-math" was included in the relatively new "-Ofast" flag. And the relatively new "__builtin_assoc_barrier()" function exists solely for use in "-ffast-math" mode (or at least, "-fassociative-math"). This shows me that the GCC developers see "-ffast-math" as important, relevant and useful, even if it is not something all users want.
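
For reference, a sketch of my own of how the barrier is used (based on my reading of the GCC documentation, so treat the details as an assumption): the expression inside the call is kept as a unit and is not reassociated with the surrounding arithmetic, even when -fassociative-math is in effect.

	/* Without the barrier, -fassociative-math lets the compiler regroup
	   these additions however it likes; with it, (a + b) is evaluated
	   first, in IEEE order, and only then added to c. */
	double grouped_sum(double a, double b, double c)
	{
	    return __builtin_assoc_barrier(a + b) + c;
	}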


The rounding errors in -ffast-math will be very similar to those in IEEE
mode, for normal numbers.

No, not at all.  Look at what -fassociative-math does, for example.
This can **and does** cause the loss of **all** bits of precision in
certain programs.  This is not theoretical.  This is real.

	double a = 1e120;
	double b = 2;

	double x = (a + b) - a;

IEEE rules will give "x" equal to 0 (the addition of 2 is lost in the rounding of "a + b") - mathematically /completely/ wrong. -ffast-math can give "x" equal to 2, which is mathematically precisely correct.

Sometimes -fassociative-math will give you worse results, sometimes it will give you better results. /Not/ using it can, and will, lead to losing all bits of precision. That is equally real.
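
For completeness, here is a standalone version of that example - a sketch of my own rather than anything from the original mails. Compile it once with "gcc -O2" and once with "gcc -O2 -ffast-math" and compare the output; whether GCC actually performs the reassociation depends on the optimiser, so treat it as an illustration rather than a guarantee.

	#include <stdio.h>
	#include <stdlib.h>

	int main(int argc, char **argv)
	{
	    /* Read the operands at run time so the compiler cannot simply
	       constant-fold the whole expression, defaulting to the values
	       from the example above. */
	    double a = (argc > 1) ? strtod(argv[1], NULL) : 1e120;
	    double b = (argc > 2) ? strtod(argv[2], NULL) : 2.0;

	    /* Strict IEEE evaluation: a + b rounds back to a (2 is far below
	       the precision of a double near 1e120), so x comes out as 0.
	       With -fassociative-math the compiler may rewrite the expression
	       as (a - a) + b and produce 2 instead. */
	    double x = (a + b) - a;

	    printf("x = %g\n", x);
	    return 0;
	}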

The simple matter is that if you want good results from your floating point, you need to have calculations that are appropriate for your inputs - or inputs that are appropriate for your calculations. That applies /equally/ whether you use -ffast-math or not.


The -ffast-math flag can only reasonably be used with programs that did
not want any specific results anyway.  It would be even faster (and just
as correct!) to always return 0.


That is simply wrong.

If you still don't understand what I am saying, then I think this mailing list is probably not the best place for such a discussion (unless others here want to chime in). There are no doubt appropriate forums where experts on floating point mathematics hang out, and can give far better explanations than I could - but I don't know where. This is not something that interests me enough - I know enough to be fully confident in the floating point I need for my own uses, and fully confident that "-ffast-math" gives me what I need with more efficient results than not using it would. I know enough to know where my limits are, and when I would need a lot more thought and analysis, or outside help and advice.

David





