There have been some discussions going on in the comp.lang.c newsgroup
about how far compilers are allowed to go when optimising based on
their knowledge of undefined behaviour (i.e., if a piece of code has
undefined behaviour, the compiler can assume that the user does not care
about the result in that case, and can therefore do whatever it wants in
order to generate faster code).
(Sorry about the length of this question - it is a difficult topic.)
I would like to know where the gcc developers stand on this - how can
the rules of the standards be interpreted? And how /should/ they be
interpreted? Does gcc aim for the most efficient code, or for the
principle of least astonishment?
For example, consider this requirement:
Write a function "int foo(int x)" that will call "bar(x)" if x is less
than 50000. Then if x is less than 1000, return x*x*x - otherwise
return any integer. (Assume 32-bit ints, no traps or other "funny
stuff" on the target.)
It seems reasonable to write:
int foo(int x) {
    if (x < 50000) bar(x);
    return x*x*x;
}
For x greater than 1290, the cubing will overflow and give a
mathematically incorrect answer (1290^3 = 2,146,689,000 still fits in a
signed 32-bit int, while 1291^3 = 2,151,685,171 does not). But that's
okay - we don't care about the result in such cases.
The question is, can the compiler reason like this:
If x > 1290, then undefined behaviour will occur. The user clearly
doesn't care what happens when x > 1290. Therefore, we can assume that
x <= 1290. Therefore, x < 50000. Therefore, there is no need to check
the conditional, and bar(x) can be called directly.
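Concretely, that chain of reasoning would let the compiler act as if
foo() had been written like this (a hypothetical sketch of the
transformation; the name is made up):

/* Hypothetical: what foo() becomes if the compiler assumes x <= 1290 */
int foo_transformed(int x) {
    bar(x);           /* the x < 50000 test has been dropped entirely */
    return x*x*x;
}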
The point here is that if the code hits undefined behaviour, then the
behaviour of the entire program is undefined. Thus the undefined
behaviour in "x*x*x" can propagate backwards into code that runs before
it.
Equally clearly, this would be against the expectations of the
programmer. And any attempt to avoid the undefined behaviour (such as
changing the last line to "return (x < 1000) ? x*x*x : 0;") would lead
to bigger and slower code than the "obvious" compilation of foo() above.
(In tests, gcc and clang both keep the check on x < 50000.
<https://gcc.godbolt.org/> is really handy!)
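For concreteness, the UB-avoiding variant described above would look
like this (a sketch with a made-up name; the extra test on x < 1000 is
exactly the cost being complained about):

int foo_defensive(int x) {
    if (x < 50000) bar(x);
    return (x < 1000) ? x*x*x : 0;   /* never overflows: 999^3 fits easily */
}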
How is the programmer to be sure that the compiler does not apply this
optimisation logic? Or is the compiler not allowed to do it?
Obviously the programmer should strive to avoid undefined behaviour, but
sometimes doing so means less efficient code than an "obvious"
compilation would give. (One way round that would be if it were possible
to write an expression that you know returns an int without undefined
behaviour, but where you don't care which one - allowing the compiler
freedom in its optimisations. Then you could write the final line of
foo() as "return (x < 1000) ? x * x * x : dont_care();", and the
compiler could generate optimal code.)
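There is no standard way to write dont_care(), but as a rough sketch of
the idea (a gcc/clang inline-asm trick, offered only as an illustration,
not a recommendation):

static inline int dont_care(void) {
    int v;
    /* Empty inline asm that claims to write v.  The compiler must
       treat v as holding some unknown value rather than being
       uninitialised, so returning it is not undefined behaviour.
       Note this only avoids the UB: the value is opaque to the
       optimiser, so it does not give quite the freedom a real
       "don't care" expression would. */
    __asm__ ("" : "=r" (v));
    return v;
}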
Or consider this code:
char arr[8192];

int foo2(int i) {
    if (i < 0) return -1;
    i *= 3;
    if (i < 0) return -2;
    if (i >= 8192) return -3;
    return arr[i]++;
}
Since signed integer overflow is undefined behaviour, and the compiler
knows that i >= 0 after the first check, it can assume that the result
of "i *= 3" is also non-negative. Therefore it can omit the check for
i < 0 on the next line. (In tests, gcc and clang both do this when
optimisation is enabled.)
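In effect, the generated code behaves as if foo2() had been written like
this (a sketch with a made-up name, showing only the transformation
described above):

int foo2_transformed(int i) {
    if (i < 0) return -1;
    i *= 3;                      /* assumed not to overflow, so still >= 0 */
    /* the "if (i < 0) return -2;" check has been optimised away */
    if (i >= 8192) return -3;
    return arr[i]++;
}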
However, if you pass an integer over (2^31 / 3) to the function, it is
likely that after the assembly code for "i *= 3" has run, the value in
the cpu register will be negative, so the obvious assembly for the
"i >= 8192" test will not trigger. The result is access to (and
modification of) memory well outside the bounds of arr, despite the
clear safety checks in the code. The compiler would be allowed to do
this even if it follows Annex L (Analyzability) of the C11 standard -
and it would be turning a "bounded undefined behaviour" (integer
overflow) into "unbounded undefined behaviour" (modification of memory
outside of bounds).
In this second example, the skipped test comes /after/ the undefined
behaviour, but the result could still easily be seen as unexpected
behaviour from the compiler.
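For completeness, one way to keep foo2() well-defined for all inputs is
to use the overflow-checking builtins that gcc and clang provide (just
one possible sketch, using the same arr as above):

int foo2_checked(int i) {
    if (i < 0) return -1;
    /* __builtin_mul_overflow returns a nonzero value if i * 3 would
       overflow; otherwise it stores the product in i.  Either way
       there is no undefined behaviour. */
    if (__builtin_mul_overflow(i, 3, &i)) return -2;
    if (i >= 8192) return -3;
    return arr[i]++;
}

Whether that compiles to code as good as the "obvious" compilation of
the original is, of course, part of the question.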