Re: AW: optimizer discards sign information

On 10/04/2024 11:24, stefan@xxxxxxxxx wrote:
-----Original Message-----
From: Alexander Monakov <amonakov@xxxxxxxxx>
Sent: Wednesday, 10 April 2024 11:17
To: stefan@xxxxxxxxx
Cc: gcc-help@xxxxxxxxxxx
Subject: Re: optimizer discards sign information


On Wed, 10 Apr 2024, stefan@xxxxxxxxx wrote:

Hi all,

I just stumbled over an issue which is present in almost all gcc versions.
I worked around it using inline assembly… Maybe gcc behaves correctly and I
am wrong? Here is the code:

https://godbolt.org/z/cW8jcdh56

typedef unsigned long long int u64;
typedef unsigned int u32;
typedef unsigned short u16;

u64 foo(u16 a, u16 b) {
     u32 x = a * b;
     u64 r = x;
     return r;
}

And with gcc 13.2 on x86-64 you get

foo:
         movzx   esi, si
         movzx   edi, di
         imul    edi, esi
         movsx   rax, edi
         ret


There is a sign extension! The optimizer step discards the information

	 x_6 = (u32) _3;

and uses _3 directly instead, which is signed.

Am I wrong or is it gcc?

GCC is not wrong. When your code computes x:

     u32 x = a * b;

'a' and 'b' are first promoted to int according to C language rules, and the
multiplication happens in the signed int type, with UB on overflow.
The compiler deduces that the range of the signed int temporary holding the
result of the multiplication is [0, 0x7fffffff], which allows it to be
propagated to the assignment of 'r' (which in the end produces a sign
extension, as you observed, so the propagation did not turn out to be useful).

u16 * u16 is a famous footgun for sure. I'd suggest 'x = 1u * a * b'
as a fix for the code.
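
A minimal sketch of the corrected function (the 1u factor makes the usual
arithmetic conversions pick unsigned int, so the whole product is computed
in unsigned arithmetic and signed overflow cannot occur):

u64 foo(u16 a, u16 b) {
     u32 x = 1u * a * b;  /* 1u * a is unsigned int, so b is converted to unsigned too */
     u64 r = x;
     return r;
}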

Alexander

Thank you for the fix 😊
I considered

	u32 x = a * b;

as good enough, since from my understanding, x *is* unsigned.

Adding the multiplication by 1u resolves this.


From the wording you use, I think perhaps you have (or had) two misunderstandings about the way C works here. First, when you have an expression "x = y", the type of "x" is irrelevant to the evaluation of the expression "y", its value, and the validity of the value.

Secondly, if you hit undefined behaviour somewhere, /everything/ afterwards is undefined behaviour. You cannot "correct" it afterwards, or somehow force it to establish properties about the result. And the compiler can assume it does not happen (or that you don't care about anything that might happen). This lets it optimise based on these assumptions.

So the fact that "x" is declared as an unsigned type is basically irrelevant. The semantics of the code "u32 x = a * b;", for the type sizes of x86-64, are that the compiler must take the u16 value in "a" and convert it into a 32-bit "int" preserving its value. It does the same with "b". It knows these are two 32-bit signed ints with values between 0 and 0xffff.

Then it is asked to multiply them. It knows the result does not overflow - because overflow in the operation would be UB. Thus it knows the result is between 0 and 0x7fff'ffff.

This is then converted to a u32. The compiler can do this any way it likes, as long as the value is conserved for all possible values (0 to 0x7fff'ffff). Typically it will be a no-op.

Then this is converted to a u64, and again the compiler can do so in any manner that is value conserving for all possible values. Since it knows the value is in the range 0 to 0x7fff'ffff, then either a sign-extension from 32-bit to 64-bit, or a zero extension, will work fine. (It could also explicitly set the upper 32-bit to 0, or bitwise-and the register with 0x0000'0000'7fff'ffff, or shift it 33 bits left then 33 bits right, or any other operation that gives the same result in the end. That's a question of optimisation and efficiency, not correctness.)
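
A sketch of the same function with those implicit steps written out explicitly
(using the typedefs from the original post; this shows the abstract C semantics
the compiler must honour, not the code GCC actually generates):

u64 foo_expanded(u16 a, u16 b) {
     int pa = a;           /* integer promotion: value-preserving, range 0..0xffff  */
     int pb = b;           /* integer promotion: value-preserving, range 0..0xffff  */
     int prod = pa * pb;   /* signed int multiplication; overflow here would be UB  */
     u32 x = (u32)prod;    /* value-preserving for the assumed range 0..0x7fff'ffff */
     u64 r = (u64)x;       /* zero- or sign-extension both give the same value here */
     return r;
}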

If it were to matter whether the compiler used sign extension or zero extension, you would already have executed undefined behaviour - and all bets are off. Thus it /doesn't/ matter which is used - the compiler can use either method, whichever it thinks is most efficient. (And if it picked the less efficient one, that's a missed optimisation bug, not a code correctness bug.)

And if you try to rely on the effects of UB in your code, then your code has a bug. Different compilers and different options may give you different end results - undefined behaviour means /anything/ can happen, including the thing you wanted to happen!


This particular case is a bit subtle, and it's easy to get caught out - after all, it's using unsigned types everywhere, and the common myth is that unsigned types don't have undefined behaviour. The reality is that it is /operations/ that have defined or undefined behaviour, not the types, and there are no arithmetic operations on types smaller than "int". "uint8_t" and "uint16_t" silently promote to "int" (on 32-bit or 64-bit systems), and operations on /int/ can have undefined behaviour.
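
As a concrete illustration of the footgun (a sketch using the typedefs from
the original post; the inputs are perfectly valid u16 values, yet the
promoted multiplication overflows int):

u32 bad(u16 a, u16 b) {
     return a * b;         /* both promote to int; for a = b = 0xffff the product   */
                           /* is 0xfffe'0001, which exceeds INT_MAX, so this is UB  */
}

u32 good(u16 a, u16 b) {
     return 1u * a * b;    /* unsigned int arithmetic: defined for all u16 inputs   */
}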


I hope that clears up any remaining misunderstandings - and I hope it didn't sound patronising about things that you already knew.

David





