On 10/04/2024 11:24, stefan@xxxxxxxxx wrote:
-----Original Message-----
From: Alexander Monakov <amonakov@xxxxxxxxx>
Sent: Wednesday, 10 April 2024 11:17
To: stefan@xxxxxxxxx
Cc: gcc-help@xxxxxxxxxxx
Subject: Re: optimizer discards sign information
On Wed, 10 Apr 2024, stefan@xxxxxxxxx wrote:
Hi all,
I just stumbled over an issue which is present in almost all gcc versions.
I worked around it using inline assembly… Maybe gcc behaves correctly and
I am wrong? Here is the code:
https://godbolt.org/z/cW8jcdh56
typedef unsigned long long int u64;
typedef unsigned int u32;
typedef unsigned short u16;
u64 foo(u16 a, u16 b) {
u32 x = a * b;
u64 r = x;
return r;
}
And on gcc 13.2 x86-64 you get
foo:
movzx esi, si
movzx edi, di
imul edi, esi
movsx rax, edi
ret
There is a sign extension! The optimizer step discards the information
x_6 = (u32) _3;
and uses _3 directly instead, which is signed.
Am I wrong or is it gcc?
GCC is not wrong. When your code computes x:
u32 x = a * b;
'a' and 'b' are first promoted to int according to C language rules, and the
multiplication happens in the signed int type, with UB on overflow.
The compiler deduces that the range of the signed int temporary holding the
result of the multiplication is [0, 0x7fffffff], which allows it to propagate
it to the assignment of 'r' (which in the end produces a sign extension, as
you observed, so the propagation did not turn out to be useful).
u16 * u16 is a famous footgun for sure. I'd suggest 'x = 1u * a * b'
as a fix for the code.
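To make the fix concrete, here is a sketch of the corrected function (my own illustration, not part of the original report). Multiplying by 1u first forces the arithmetic into unsigned int, so the signed-overflow UB cannot occur:

```c
#include <stdint.h>

typedef uint64_t u64;
typedef uint32_t u32;
typedef uint16_t u16;

u64 foo(u16 a, u16 b) {
    /* 1u * a converts 'a' to unsigned int, so the whole product is
       computed in unsigned arithmetic; wraparound is well-defined. */
    u32 x = 1u * a * b;
    u64 r = x;  /* now a plain zero extension */
    return r;
}
```

With this version the u32-to-u64 conversion should compile to a zero extension rather than the movsx seen above.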
Alexander
Thank you for the fix 😊
I considered
u32 x = a * b;
as good enough, since from my understanding, x *is* unsigned.
Adding the multiplication with 1u resolves this.
From the wording you use, I think perhaps you have (or had) two
misunderstandings about the way C works here. First, when you have an
expression "x = y", the type of "x" is irrelevant to the evaluation of
the expression "y", its value, and the validity of the value.
Secondly, if you hit undefined behaviour somewhere, /everything/
afterwards is undefined behaviour. You cannot "correct" it, or force it
in some ways to establish properties about it. And the compiler can
assume it does not happen (or you don't care about anything that might
happen). This lets it optimise based on these assumptions.
So the fact that "x" is declared as an unsigned type is basically
irrelevant. The semantics of the code "u32 x = a * b;", for the type
sizes of x86-64, are that the compiler must take the u16 value in "a"
and convert it into a 32-bit "int" preserving its value. It does the
same with "b". It knows these are two 32-bit signed ints with values
between 0 and 0xffff.
Then it is asked to multiply them. It knows the result does not
overflow - because overflow in the operation would be UB. Thus it knows
the result is between 0 and 0x7fff'ffff.
This is then converted to a u32. The compiler can do this any way it
likes, as long as the value is conserved for all possible values (0 to
0x7fff'ffff). Typically it will be a no-op.
Then this is converted to a u64, and again the compiler can do so in any
manner that is value conserving for all possible values. Since it knows
the value is in the range 0 to 0x7fff'ffff, then either a sign-extension
from 32-bit to 64-bit, or a zero extension, will work fine. (It could
also explicitly set the upper 32-bit to 0, or bitwise-and the register
with 0x0000'0000'7fff'ffff, or shift it 33 bits left then 33 bits right,
or any other operation that gives the same result in the end. That's a
question of optimisation and efficiency, not correctness.)
If it were to matter whether the compiler used sign extension or zero
extension, you would already have executed undefined behaviour -
and all bets are off. Thus it /doesn't/ matter which is used - the
compiler can use either method, whichever it thinks is most efficient.
(And if it picked the less efficient one, that's a missed optimisation
bug, not a code correctness bug.)
And if you try to rely on the effects of UB in your code, then your code
has a bug. Different compilers and different options may give you
different end results - undefined behaviour means /anything/ can happen,
including the thing you wanted to happen!
This particular case is a bit subtle, and it's easy to get caught out -
after all, it's using unsigned types everywhere, and the common myth is
that unsigned types don't have undefined behaviour. The reality is that
it is /operations/ that have defined or undefined behaviour, not the
types, and there are no arithmetic operations on types smaller than
"int". "uint8_t" and "uint16_t" silently promote to "int" (on 32-bit or
64-bit systems), and operations on /int/ can have undefined behaviour.
I hope that clears up any remaining misunderstandings - and I hope it
didn't sound patronising about things that you already knew.
David