There have been some discussions going on in the comp.lang.c newsgroup
about how far compilers are allowed to go when optimising based on
their knowledge of undefined behaviour (i.e., if a piece of code has
undefined behaviour, the compiler can assume that the user does not care
about the result in that case, and can therefore do whatever it wants in
order to generate faster code).
(Sorry about the length of this question - it is a difficult topic.)
I would like to know where the gcc developers stand on this - how can
the rules of the standards be interpreted? And how /should/ they be
interpreted? Does gcc aim for the most efficient code, or for the
principle of least astonishment?
For example, consider this requirement:
Write a function "int foo(int x)" that will call "bar(x)" if x is less
than 50000. Then if x is less than 1000, return x*x*x - otherwise
return any integer. (Assume 32-bit ints, no traps or other "funny
stuff" on the target.)
It seems reasonable to write:
int foo(int x) {
    if (x < 50000) bar(x);
    return x*x*x;
}
For x greater than 1290, the cubing will overflow and give a
mathematically incorrect answer (1290^3 = 2,146,689,000 still fits in a
signed 32-bit int, while 1291^3 = 2,151,685,171 does not). But that's
okay - we don't care about the result in such cases.
The question is, can the compiler reason like this:
If x > 1290, then undefined behaviour will occur. The user clearly
doesn't care what happens when x > 1290. Therefore, we can assume that
x <= 1290. Therefore, x < 50000. Therefore, there is no need to check
the conditional, and bar(x) can be called directly.
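Concretely, that chain of reasoning would let the compiler act as if
foo() had been written like this (a hypothetical sketch of the
transformation; the name is made up):

/* Hypothetical: what foo() becomes if the compiler assumes x <= 1290 */
int foo_transformed(int x) {
    bar(x);           /* the x < 50000 test has been dropped entirely */
    return x*x*x;
}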
The point here is that if the code hits undefined behaviour, then the
behaviour of the entire program is undefined. Thus the undefined
behaviour in "x*x*x" can propagate backwards into code that runs before
it.
Equally clearly, this would be against the expectations of the
programmer. And any attempt to avoid the undefined behaviour (such as
changing the last line to "return (x < 1000) ? x*x*x : 0;") would lead
to bigger and slower code than the "obvious" compilation of foo() above.
(In tests, gcc and clang both keep the check on x < 50000.
<https://gcc.godbolt.org/> is really handy!)
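For concreteness, the UB-avoiding variant described above would look
like this (a sketch with a made-up name; the extra test on x < 1000 is
exactly the cost being complained about):

int foo_defensive(int x) {
    if (x < 50000) bar(x);
    return (x < 1000) ? x*x*x : 0;   /* never overflows: 999^3 fits easily */
}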
How is the programmer to be sure that the compiler does not apply this
optimisation logic? Or is the compiler not allowed to do it?
Obviously the programmer should strive to avoid undefined behaviour, but
sometimes doing so means less efficient code than an "obvious"
compilation would give. (One way round that would be if it were possible
to write an expression that you know returns an int without undefined
behaviour, but where you don't care which one - allowing the compiler
freedom in its optimisations. Then you could write the final line of
foo() as "return (x < 1000) ? x * x * x : dont_care();", and the
compiler could generate optimal code.)
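There is no standard way to write dont_care(), but as a rough sketch of
the idea (a gcc/clang inline-asm trick, offered only as an illustration,
not a recommendation):

static inline int dont_care(void) {
    int v;
    /* Empty inline asm that claims to write v.  The compiler must
       treat v as holding some unknown value rather than being
       uninitialised, so returning it is not undefined behaviour.
       Note this only avoids the UB: the value is opaque to the
       optimiser, so it does not give quite the freedom a real
       "don't care" expression would. */
    __asm__ ("" : "=r" (v));
    return v;
}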
Or consider this code:
char arr[8192];

int foo2(int i) {
    if (i < 0) return -1;
    i *= 3;
    if (i < 0) return -2;
    if (i >= 8192) return -3;
    return arr[i]++;
}
Since signed integer overflow is undefined behaviour, and the compiler
knows that i >= 0 after the first check, it can assume that the result
of "i *= 3" is also non-negative. Therefore it can omit the check for
i < 0 on the next line. (In tests, gcc and clang both do this when
optimisation is enabled.)
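In effect, the generated code behaves as if foo2() had been written like
this (a sketch with a made-up name, showing only the transformation
described above):

int foo2_transformed(int i) {
    if (i < 0) return -1;
    i *= 3;                      /* assumed not to overflow, so still >= 0 */
    /* the "if (i < 0) return -2;" check has been optimised away */
    if (i >= 8192) return -3;
    return arr[i]++;
}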
However, if you pass an integer over (2^31 / 3) to the function, it is
likely that after the assembly code for "i *= 3" has run, the value in
the cpu register will be negative, so the obvious assembly for the
"i >= 8192" test will not trigger. The result is access to (and
modification of) memory well outside the bounds of arr, despite the
clear safety checks in the code. The compiler would be allowed to do
this even if it follows Annex L (Analyzability) of the C11 standard -
and it would be turning a "bounded undefined behaviour" (integer
overflow) into "unbounded undefined behaviour" (modification of memory
outside of bounds).
In this second example, the skipped test comes /after/ the undefined
behaviour, but the result could still easily be seen as unexpected
behaviour from the compiler.
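For completeness, one way to keep foo2() well-defined for all inputs is
to use the overflow-checking builtins that gcc and clang provide (just
one possible sketch, using the same arr as above):

int foo2_checked(int i) {
    if (i < 0) return -1;
    /* __builtin_mul_overflow returns a nonzero value if i * 3 would
       overflow; otherwise it stores the product in i.  Either way
       there is no undefined behaviour. */
    if (__builtin_mul_overflow(i, 3, &i)) return -2;
    if (i >= 8192) return -3;
    return arr[i]++;
}

Whether that compiles to code as good as the "obvious" compilation of
the original is, of course, part of the question.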