On 05/01/2024 16:00, Segher Boessenkool wrote:
On Fri, Jan 05, 2024 at 03:24:48PM +0100, David Brown via Gcc-help wrote:
On 04/01/2024 17:55, Segher Boessenkool wrote:
Most things work on a per-function basis; almost nothing works per RTL
instruction. There is no per-instruction representation for -fwrapv
in the RTL stream.
Yes, I appreciate that. And I can also imagine that carrying such
option information in the AST to make this possible would be a
significant burden, and very rarely of benefit - so unless there is some
other important use-case, it is not a good trade-off.
Things like -fwrapv and -ftrapv have semantics that naturally could be
done per-insn. Many things are not like that :-/
Indeed.
But even then, what is supposed to happen if some optimisation works on
a bunch of insns, some with -fwrapv (or -ftrapv) semantics and some not?
The only safe thing to do is to not allow any transformations on mixed
insns at all.
Sometimes mixing would be possible, sometimes not. You can't mix "trap
on signed integer overflow" with "wrap on signed integer overflow" and
expect useful results. But you /can/ mix "wrap on signed integer
overflow" with "signed integer overflow is UB" - then you wrap.
But I can't imagine it's worth the GCC development time trying to figure
out what could work and what could not work, and implementing this,
unless someone is /really/ bored! After all, this can all be done by
hand using conversions to unsigned types, and the __builtin_*_overflow()
functions when needed.
Yes, but that is only true for -ffast-math (which means "the user does
not care about correct results" anyway).
(Getting a little off-topic...
Um, that's not what "-ffast-math" means. It means "the user is using
floating point as a close approximation to real number arithmetic, and
promises to stick to numerically stable calculations". All my uses of
floating point are done with "-ffast-math", and I /do/ care that the
results are correct. But the definition of "correct" for my work is "as
close to the theoretical real number result as you can get with a
limited accuracy format, plus or minus small rounding errors".
-ffast-math is allowed to introduce any rounding error it wants. Which
can (in a loop for example) easily introduce unlimited rounding error,
bigger than the actual result. And this is not just theoretical either.
Normal maths mode can also lead to rounding errors that can build up -
the fact that rounding is carefully specified with IEEE does not mean
there are no errors (compared to the theoretical perfect real-number
calculation). It may be easier to get problems with -ffast-math, and
you may get them with smaller loop counts, but it is inevitable that any
finite approximation to real numbers will lead to errors, and that some
calculations will be numerically unstable. IEEE means that you can do
your testing on a fast PC and then deploy your code in a 16-bit
microcontroller and have identical stability - but it does not mean that
you don't get rounding errors.
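A small illustration of error building up in plain IEEE mode, with no -ffast-math involved (sum_repeated is just a name picked for this sketch, and it assumes an ordinary target where double is IEEE binary64 with no excess precision):

```c
/* Add x to an accumulator n times using ordinary double arithmetic.
   Each addition is rounded to nearest, so the representation error
   in x and the per-step rounding errors both accumulate. */
static double sum_repeated(double x, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += x;
    return sum;
}
```

On such a target sum_repeated(0.1, 10) does not compare equal to 1.0, even though it is within a few units in the last place of it, while sum_repeated(0.5, 4) is exactly 2.0 because every intermediate value is representable.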
Yes, there is a lot of code where this doesn't matter, in practice. How
lucky do you feel today?
I use gcc, so I feel pretty lucky :-)
The rounding errors in -ffast-math will be very similar to those in IEEE
mode, for normal numbers. The operations are the same - it all
translates to the same floating point cpu instructions, or the same
software floating point library calls. You don't have control of
rounding modes, so you have to assume that rounding will be the least
helpful of any FLT_ROUNDS setting - but you will not get worse than
that. This is a "quality of implementation" issue, rather than a
specified guarantee, but compiler users rely on good quality
implementation all the time. After all, there are no guarantees in the
C standards or in the gcc user manual that integer multiplication will
be done using efficient code rather than repeated addition in a loop.
-ffast-math allows some changes to the order of calculations, or
contracting of expressions, so you need to take that into account. But
then, you need to take it into account in the way you write your
expressions in IEEE mode too, and unless you put a lot of effort into
picking your expression ordering, the -ffast-math re-arrangements are as
likely to improve your results (in terms of the difference compared to
theoretical results) as they are to make them worse.
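To make the ordering point concrete, here is a minimal sketch (function names invented for the example) of one sum written with two groupings. In IEEE mode the compiler has to keep whichever grouping the source uses; with -ffast-math, which enables -fassociative-math, it is free to choose.

```c
/* The same three values summed with two different groupings. */
static double sum_left(double a, double b, double c)
{
    return (a + b) + c;   /* cancel a and b first, then add c */
}

static double sum_right(double a, double b, double c)
{
    return a + (b + c);   /* c may be absorbed into b before a is added */
}
```

Built without -ffast-math, sum_left(1e20, -1e20, 1.0) is 1.0 while sum_right(1e20, -1e20, 1.0) is 0.0, because 1.0 is far smaller than the spacing between doubles near 1e20. Both are valid IEEE results for the expression as written; which one is closer to the real-number sum depends on the data and on how the programmer happened to write it.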
Basically, I assume that the GCC developers try to be sensible and
helpful, and do not go out of their way to generate intentionally bad
code for people who use one of their optimisation flags. I assume that
if "-ffast-math" and the associated sub-flags were as big a risk as you
are implying, they would have been removed from gcc or at least a big
red warning would be added to the manual. So far, I've been lucky!
The only way to safely use -ffast-math is to inspect the generated
machine code. After each and every compilation you do. And everyone
who uses a different compiler version (or is on a different target,
etc.) has to do the same thing.
I do actually check the generated code for some of what I do. I can't
say I have ever felt the need to check generated floating point code
because I worry about the correctness, but sometimes I do so to see if
I've got the efficiency I expect (this is not floating point specific).
And I also consider exact compiler versions and build flags as a part of
my projects - bit-perfect repeatable builds are important in my work, so
I don't change compiler versions or targets within a project without
very good reason and a great deal of checking and re-testing.
For other people, full IEEE compliance, support for NaNs, and
bit-perfect repeatable results regardless of optimisations and target
details, are important for correctness. And that's fine, and it's great
that gcc supports both kinds of code - though I believe that
"-ffast-math" would actually be more appropriate for a large proportion
of programs.)
Most people think that IEEE 754 was a huge step forward over wild west
floating point like we used decades ago.
Oh, sure - no doubts there. But it has plenty of features that are of
no use to me, in my work, and I am happy to ignore them and have gcc
generate the best code it can while ignoring things I don't need.
David