Re: pragma GCC optimize prevents inlining

On 05/01/2024 16:00, Segher Boessenkool wrote:
> On Fri, Jan 05, 2024 at 03:24:48PM +0100, David Brown via Gcc-help wrote:
>> On 04/01/2024 17:55, Segher Boessenkool wrote:
>>> Most things work on function basis; almost nothing works per RTL
>>> instruction.  There is no per-instruction representation for -fwrapv
>>> in the RTL stream.

>> Yes, I appreciate that.  And I can also imagine that carrying such
>> option information in the AST to make this possible would be a
>> significant burden, and very rarely of benefit - so unless there is some
>> other important use-case then it is not a good trade-off.

> Things like -fwrapv and -ftrapv have semantics that naturally could be
> done per-insn.  Many things are not like that :-/

Indeed.
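To make the mixing problem concrete, here is a hypothetical sketch (the function names are my own invention) using GCC's "optimize" function attribute, which applies -fwrapv on a per-function basis.  If the first function were inlined into the second, wrapping and non-wrapping additions would sit in the same insn stream with nothing in RTL to tell them apart - which, as I understand it, is why inlining is blocked across such option boundaries:

/* Compiled as if with -fwrapv: this addition must wrap on overflow. */
__attribute__((optimize("wrapv")))
static int wrapping_add(int a, int b)
{
    return a + b;
}

/* Default rules: signed overflow here is undefined behaviour, so the
   optimiser may assume the outer addition never overflows. */
int caller(int x)
{
    return wrapping_add(x, 1) + 1;
}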


> But even then, what is supposed to happen if some optimisation works on
> a bunch of insns, some with -fwrapv (or -ftrapv) semantics and some not?
> The only safe thing to do is to not allow any transformations on mixed
> insns at all.

Sometimes mixing would be possible, sometimes not. You can't mix "trap on signed integer overflow" with "wrap on signed integer overflow" and expect useful results. But you /can/ mix "wrap on signed integer overflow" with "signed integer overflow is UB" - then you wrap.

But I can't imagine it's worth the GCC development time trying to figure out what could work and what could not work, and implementing this, unless someone is /really/ bored!  After all, this can all be done by hand, using conversions to unsigned types and the __builtin_add_overflow() family of functions when needed.
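For example, a minimal sketch of the by-hand approach (needing no special flags):

#include <limits.h>
#include <stdio.h>

/* Wrapping signed addition by hand: unsigned arithmetic wraps by
   definition, and converting the result back to int is
   implementation-defined - GCC defines it as modulo (two's
   complement) wrapping. */
static int add_wrap(int a, int b)
{
    return (int)((unsigned)a + (unsigned)b);
}

/* Checked addition via the overflow builtins: the builtin returns
   true if the mathematical result did not fit in 'result'. */
static int add_checked(int a, int b)
{
    int result;
    if (__builtin_add_overflow(a, b, &result))
        __builtin_trap();   /* or whatever -ftrapv-like handling you want */
    return result;
}

int main(void)
{
    printf("%d\n", add_wrap(INT_MAX, 1));   /* prints INT_MIN */
    printf("%d\n", add_checked(1, 2));      /* prints 3 */
    return 0;
}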


>>> Yes, but that is only true for -ffast-math (which means "the user does
>>> not care about correct results" anyway).

>> (Getting a little off-topic...
>>
>> Um, that's not what "-ffast-math" means.  It means "the user is using
>> floating point as a close approximation to real number arithmetic, and
>> promises to stick to numerically stable calculations".  All my uses of
>> floating point are done with "-ffast-math", and I /do/ care that the
>> results are correct.  But the definition of "correct" for my work is "as
>> close to the theoretical real number result as you can get with a
>> limited accuracy format, plus or minus small rounding errors".

> -ffast-math is allowed to introduce any rounding error it wants.  Which
> can (in a loop for example) easily introduce unlimited rounding error,
> bigger than the actual result.  And this is not just theoretical either.


Normal maths mode can also lead to rounding errors that can build up - the fact that rounding is carefully specified with IEEE does not mean there are no errors (compared to the theoretical perfect real-number calculation). It may be easier to get problems with -ffast-math, and you may get them with smaller loop counts, but it is inevitable that any finite approximation to real numbers will lead to errors, and that some calculations will be numerically unstable. IEEE means that you can do your testing on a fast PC and then deploy your code in a 16-bit microcontroller and have identical stability - but it does not mean that you don't get rounding errors.
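As a simple illustration (a sketch - the exact drift depends on the target's float format), compile this without -ffast-math and the error is still there, because 0.1 has no exact binary representation and every addition rounds:

#include <stdio.h>

int main(void)
{
    float sum = 0.0f;
    for (int i = 0; i < 1000000; i++)
        sum += 0.1f;    /* adds an already-rounded value, then rounds the sum */

    /* The theoretical real-number result is exactly 100000, but the
       accumulated rounding error is clearly visible in the output. */
    printf("sum = %f\n", sum);
    return 0;
}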

> Yes, there is a lot of code where this doesn't matter, in practice.  How
> lucky do you feel today?

I use gcc, so I feel pretty lucky :-)

The rounding errors in -ffast-math will be very similar to those in IEEE mode, for normal numbers. The operations are the same - it all translates to the same floating point CPU instructions, or the same software floating point library calls. You don't have control of rounding modes, so you have to assume that rounding will be the least helpful of any FLT_ROUNDS setting - but you will not get worse than that. This is a "quality of implementation" issue, rather than a specified guarantee, but compiler users rely on a good-quality implementation all the time. After all, there are no guarantees in the C standards or in the gcc user manual that integer multiplication will be done using efficient code rather than repeated addition in a loop.

-ffast-math allows some changes to the order of calculations, or contracting of expressions, so you need to take that into account. But then, you need to take it into account in the way you write your expressions in IEEE mode too, and unless you put a lot of effort into picking your expression ordering, the -ffast-math re-arrangements are as likely to improve your results (in terms of the difference compared to theoretical results) as they are to make them worse.
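A small sketch of the sort of re-arrangement in question (the exact results will vary with flags and target - that is rather the point):

#include <stdio.h>

int main(void)
{
    /* Strict IEEE evaluation respects the parentheses: 1e16 + 1.0
       rounds back to 1e16 (the spacing between doubles there is 2.0),
       so r is 0.0.  With -ffast-math (-fassociative-math), GCC may
       simplify (big + small) - big to small, giving 1.0 - closer to
       the real-number answer, but different. */
    double big = 1e16, small = 1.0;
    double r = (big + small) - big;
    printf("r = %g\n", r);

    /* Contraction: with -ffp-contract=fast (which GCC often enables by
       default, and -ffast-math allows) a*b + c may become one fused
       multiply-add, which rounds once instead of twice - here a*b is
       exactly 1 - 2^-54, so the fused and unfused results differ. */
    double a = 1.0 + 0x1p-27, b = 1.0 - 0x1p-27, c = -1.0;
    printf("a*b + c = %g\n", a * b + c);
    return 0;
}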

Basically, I assume that the GCC developers try to be sensible and helpful, and do not go out of their way to generate intentionally bad code for people who use one of their optimisation flags. I assume that if "-ffast-math" and the associated sub-flags were as big a risk as you are implying, they would have been removed from gcc, or at least a big red warning would have been added to the manual. So far, I've been lucky!


> The only way to safely use -ffast-math is to inspect the generated
> machine code.  After each and every compilation you do.  And everyone
> who uses a different compiler version (or is on a different target,
> etc.) has to do the same thing.


I do actually check the generated code for some of what I do. I can't say I have ever felt the need to check generated floating point code because I worry about the correctness, but sometimes I do so to see if I've got the efficiency I expect (this is not floating point specific). And I also consider exact compiler versions and build flags as a part of my projects - bit-perfect repeatable builds are important in my work, so I don't change compiler versions or targets within a project without very good reason and a great deal of checking and re-testing.

>> For other people, full IEEE compliance, support for NaNs, and
>> bit-perfect repeatable results regardless of optimisations and target
>> details, are important for correctness.  And that's fine, and it's great
>> that gcc supports both kinds of code - though I believe that
>> "-ffast-math" would actually be more appropriate for a large proportion
>> of programs.)

> Most people think that IEEE 754 was a huge step forward over wild west
> floating point like we used decades ago.


Oh, sure - no doubts there. But it has plenty of features that are of no use to me, in my work, and I am happy to ignore them and have gcc generate the best code it can while ignoring things I don't need.

David




