Re: pragma GCC optimize prevents inlining

David Brown via Gcc-help <gcc-help@xxxxxxxxxxx> · Fri, 5 Jan 2024 16:53:35 +0100

On 05/01/2024 16:00, Segher Boessenkool wrote:
On Fri, Jan 05, 2024 at 03:24:48PM +0100, David Brown via Gcc-help wrote:
On 04/01/2024 17:55, Segher Boessenkool wrote:
Most things work on function basis; almost nothing works per RTL
instruction.  There is no per-instruction representation for -fwrapv
in the RTL stream.

Yes, I appreciate that.  And I can also imagine that carrying such
option information in the AST to make this possible would be a
significant burden, and very rarely of benefit - so unless there is some
other important use-case then it is not a good trade-off.

Things like -fwrapv and -ftrapv have semantics that naturally could be
done per-insn.  Many things are not like that :-/

Indeed.

But even then, what is supposed to happen if some optimisation works on
a bunch of insns, some with -fwrapv (or -ftrapv) semantics and some not?
The only safe thing to do is to not allow any transformations on mixed
insns at all.

Sometimes mixing would be possible, sometimes not.  You can't mix "trap 
on signed integer overflow" with "wrap on signed integer overflow" and 
expect useful results.  But you /can/ mix "wrap on signed integer 
overflow" with "signed integer overflow is UB" - then you wrap.

But I can't imagine it's worth the GCC development time trying to figure 
out what could work and what could not work, and implementing this, 
unless someone is /really/ bored!  After all, this can all be done by 
hand using conversions to unsigned types, and the __builtin_*_overflow() 
functions (__builtin_add_overflow() and friends) when needed.
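
As a rough illustration of doing this by hand, here is a minimal sketch. The function names wrapping_add and checked_add are made up for this example. The wrapping version relies on GCC's documented behaviour that converting an out-of-range value to a signed type reduces it modulo 2^N (ISO C leaves that conversion implementation-defined).

```c
#include <limits.h>
#include <stdbool.h>

/* Wrapping signed addition without -fwrapv: do the arithmetic in unsigned,
   where overflow is defined to wrap modulo 2^N.  The conversion back to
   int is implementation-defined in ISO C; GCC defines it as modulo 2^N. */
int wrapping_add(int a, int b)
{
    return (int)((unsigned)a + (unsigned)b);
}

/* Checked signed addition: returns true and stores the sum in *sum if it
   fits in an int, returns false if the addition would overflow. */
bool checked_add(int a, int b, int *sum)
{
    /* __builtin_add_overflow returns true when overflow occurred. */
    return !__builtin_add_overflow(a, b, sum);
}
```

Neither function depends on the -fwrapv or -ftrapv setting of the translation unit, so they can be mixed freely with code compiled under the default "overflow is UB" rules.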

Yes, but that is only true for -ffast-math (which means "the user does
not care about correct results" anyway).

(Getting a little off-topic...

Um, that's not what "-ffast-math" means.  It means "the user is using
floating point as a close approximation to real number arithmetic, and
promises to stick to numerically stable calculations".  All my uses of
floating point are done with "-ffast-math", and I /do/ care that the
results are correct.  But the definition of "correct" for my work is "as
close to the theoretical real number result as you can get with a
limited accuracy format, plus or minus small rounding errors".

-ffast-math is allowed to introduce any rounding error it wants.  Which
can (in a loop for example) easily introduce unlimited rounding error,
bigger than the actual result.  And this is not just theoretical either.

Normal maths mode can also lead to rounding errors that can build up - 
the fact that rounding is carefully specified with IEEE does not mean 
there are no errors (compared to the theoretical perfect real-number 
calculation).  It may be easier to get problems with -ffast-math, and 
you may get them with smaller loop counts, but it is inevitable that any 
finite approximation to real numbers will lead to errors, and that some 
calculations will be numerically unstable.  IEEE means that you can do 
your testing on a fast PC and then deploy your code in a 16-bit 
microcontroller and have identical stability - but it does not mean that 
you don't get rounding errors.
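
A small sketch of rounding error building up in plain IEEE mode, with no -ffast-math involved. The function name sum_repeated is made up for this example, and it assumes double is IEEE 754 binary64 evaluated at double precision (as on x86-64 with SSE2).

```c
/* Add x to an accumulator n times.  Each addition is correctly rounded
   under IEEE 754, but the individual rounding errors still accumulate. */
double sum_repeated(double x, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++) {
        s += x;
    }
    return s;
}
```

Under those assumptions sum_repeated(0.1, 10) comes out slightly below 1.0, because 0.1 is not exactly representable and each step rounds again. With a value such as 0.5, which is exactly representable, the sum is exact.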

Yes, there is a lot of code where this doesn't matter, in practice.  How
lucky do you feel today?

I use gcc, so I feel pretty lucky :-)

The rounding errors in -ffast-math will be very similar to those in IEEE 
mode, for normal numbers.  The operations are the same - it all 
translates to the same floating point cpu instructions, or the same 
software floating point library calls.  You don't have control of 
rounding modes, so you have to assume that rounding will be the least 
helpful of any FLT_ROUNDS setting - but you will not get worse than 
that.  This is a "quality of implementation" issue, rather than a 
specified guarantee, but compiler users rely on good quality 
implementation all the time.  After all, there are no guarantees in the 
C standards or in the gcc user manual that integer multiplication will 
be done using efficient code rather than repeated addition in a loop.

-ffast-math allows some changes to the order of calculations, or 
contracting of expressions, so you need to take that into account.  But 
then, you need to take it into account in the way you write your 
expressions in IEEE mode too, and unless you put a lot of effort into 
picking your expression ordering, the -ffast-math re-arrangements are as 
likely to improve your results (in terms of the difference compared to 
theoretical results) as they are to make them worse.
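
To make the ordering point concrete, here is a sketch with two hypothetical functions that differ only in where the parentheses go. In IEEE mode the compiler must evaluate each as written. With -ffast-math (specifically -fassociative-math) it is allowed to treat them as interchangeable. The example assumes IEEE 754 binary64 doubles.

```c
/* (a + b) + c, evaluated left to right. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* a + (b + c), with the right-hand pair added first. */
double sum_right(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 1e20, b = -1e20 and c = 1.0, the left-grouped form gives 1.0, which matches the real-number result, while the right-grouped form gives 0.0 because 1.0 is lost when added to -1e20. Which grouping is "better" depends entirely on the data, so a re-arrangement can just as well repair a poor ordering as spoil a good one.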

Basically, I assume that the GCC developers try to be sensible and 
helpful, and do not go out of their way to generate intentionally bad 
code for people who use one of their optimisation flags.  I assume that 
if "-ffast-math" and the associated sub-flags were as big a risk as you 
are implying, they would have been removed from gcc or at least a big 
red warning would be added to the manual.  So far, I've been lucky!

The only way to safely use -ffast-math is to inspect the generated
machine code.  After each and every compilation you do.  And everyone
who uses a different compiler version (or is on a different target,
etc.) has to do the same thing.

I do actually check the generated code for some of what I do.  I can't 
say I have ever felt the need to check generated floating point code 
because I worry about the correctness, but sometimes I do so to see if 
I've got the efficiency I expect (this is not floating point specific). 
And I also consider exact compiler versions and build flags as a part of 
my projects - bit-perfect repeatable builds are important in my work, so 
I don't change compiler versions or targets within a project without 
very good reason and a great deal of checking and re-testing.

For other people, full IEEE compliance, support for NaNs, and
bit-perfect repeatable results regardless of optimisations and target
details, are important for correctness.  And that's fine, and it's great
that gcc supports both kinds of code - though I believe that
"-ffast-math" would actually be more appropriate for a large proportion
of programs.)

Most people think that IEEE 754 was a huge step forward over wild west
floating point like we used decades ago.

Oh, sure - no doubts there.  But it has plenty of features that are of 
no use to me, in my work, and I am happy to ignore them and have gcc 
generate the best code it can while ignoring things I don't need.

David