Re: Possible gcc bug in strict type aliasing

David Brown <david@xxxxxxxxxxxxxxx> · Mon, 26 Sep 2016 13:35:00 +0200

On 26/09/16 11:32, Andrew Haley wrote:
> On 25/09/16 22:46, David Brown wrote:
> 
> I think the bug is here:
> 
>>        temp = *t2p;      // Read as T2
>>        t1p2 = (T1*)t2p;  // Visible T2 to T1 pointer conversion
>>        *t1p2 = temp;     // Write as T1
> 
> 6.3.2.3 Pointers
> 
> 7 A pointer to an object type may be converted to a pointer to a
>   different object type. If the resulting pointer is not correctly
>   aligned for the referenced type, the behavior is undefined.
>   Otherwise, when converted back again, the result shall compare equal
>   to the original pointer.
> 
> Note that you have permission only to convert the pointer back to the
> original type and compare it.  You don't have permission to
> dereference it as a different type.  IMO your program is undefined.
> 
> This is key to alias analysis: we know that a pointer to T1 can only
> point to objects compatible with T1.  It's not possible to "hide" a
> pointer to T2 from the compiler by converting it to T1, passing it to
> a function, and then converting it back to T2 and dereferencing it.

But with "*t1p2 = temp;", we are writing as a T1 through a pointer to
T1.  Then the return value is also read via a pointer to a T1 ("return
*t1p;").

It looks like gcc is simply ignoring the "*t1p2 = temp;" statement.
This may be because it knows any attempt to dereference t1p2 is
undefined (since it was created by a cast from a different pointer
type), or it may be because it knows that the effective type of *t1p2 is
actually a T2 since that's what was first stored at that address (from
"*t2p = T2VALUE").

Does that make sense?

If so, it seems like quite an aggressive optimisation, and one that may
surprise people.  Other compilers (clang, icc) treat it differently.  And
the code generation for gcc here is quite fragile - changing "*t1p2 =
temp;" to "*t1p2 = temp + 1;" changes the result from 100 to 201.  This
can be a true pain to debug: code that appears to work correctly in your
testing can, after a small change somewhere else and through inlining
and LTO, silently start producing different output.

While I fully appreciate (and agree with) the policy of using undefined
behaviour effects to generate more efficient code, this looks like a
fairly subtle effect which silently generates unexpected code.

Perhaps it would be worth filing a request for better warnings here?
With -Wstrict-aliasing=1, gcc gives a warning "dereferencing type-punned
pointer might break strict-aliasing rules".  But the warning does not
appear with -Wstrict-aliasing=3, which is generally supposed to be more
accurate and which is the default (when -fstrict-aliasing and -Wall are
in effect).  If gcc uses strict aliasing in order to remove statements
entirely, should it not be able to give a warning with a wider variety
of warning options?

> 
> If you lie to the compiler, it will get its revenge.
> 

Yes, I know.  (I didn't write the code - it was created as an example of
code that may show questionable code generation in gcc.)

I'm okay with the compiler getting its revenge - but I would /really/
like it to tell me about it!
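
For comparison, here is a minimal sketch of the same "store as T2, read
as T2, rewrite as T1, read as T1" sequence expressed without any pointer
cast.  The definitions of T1 and T2 are not shown in this message, so
the typedefs below are stand-ins, and the names Slot and rewrite_as_t1
are hypothetical.  It is not the code from the original example, only
one way the pattern could be written so that every access goes through
a union member, which is the form gcc documents as safe under
-fstrict-aliasing.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: T1 and T2 are not defined in this message, so two
   distinct integer types are used as stand-ins. */
typedef long      T1;
typedef long long T2;

/* Hypothetical helper type: one piece of storage that may hold either. */
typedef union {
    T1 as_t1;
    T2 as_t2;
} Slot;

/* Store a T2, read it back as a T2, then rewrite the storage as a T1.
   Every access names a union member, so no pointer cast is involved. */
static T1 rewrite_as_t1(Slot *slot, T2 value)
{
    assert(slot != NULL);

    slot->as_t2 = value;        /* write as T2 */
    T2 temp = slot->as_t2;      /* read as T2 */
    slot->as_t1 = (T1)temp;     /* write as T1 (value conversion) */
    return slot->as_t1;         /* read as T1 */
}
```

Each read here is of the member that was stored last, so the function
simply returns the value it was given, converted to T1.  The cost is
that callers have to pass the union rather than a bare T1* or T2*.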