Hi,
I have been involved in a discussion in the comp.lang.c newsgroup,
concerning a possible case of gcc being a bit too aggressive in
optimising based on strict type alias analysis. In the code below, gcc
gives the result 100 when optimising at -O2 or above on 64-bit targets (in
which "int64_t" is a typedef for "long"), but 200 for lower
optimisations, or with -fno-strict-aliasing, or on 32-bit targets (where
"int64_t" is a typedef for "long long"). Other compilers such as clang
and icc consistently give 200.
On x86-64 (linux target), the functions blah3 and test3 are compiled
(-O2) to:
blah3:
        movq $100, (%rdi)
        movl $100, %eax
        movq $200, (%rsi)
        ret
test3:
        movl $100, %eax
        ret
With -O1 (or -O2 -fno-strict-aliasing), it compiles to:
blah3:
        movq $100, (%rdi)
        movq $200, (%rsi)
        movq (%rdi), %rax
        ret
test3:
        movl $200, %eax
        ret
gcc 4.7 and later generate this same code. gcc 4.5 does less optimisation,
and always returns 200. gcc 4.6 is interesting - blah3 returns 100,
while test3 returns 200, giving a mixture of the two.
The question is, is gcc's -O2 optimisation valid here? Does the fact
that the pointers come from dynamic memory (and therefore the buffer's
effective type is only decided when written) affect that?
An interesting effect is that if the line "*t1p2 = temp;" is changed to
"*t1p2 = temp + 1;", the compiled code is the same at -O1 and -O2, and
returns 201 as the final result.
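To make that variation concrete, here is a minimal sketch of it as a standalone pair of functions. The names blah3_plus1 and test3_plus1 are hypothetical and are not in my test file; the only intended difference from blah3 is the "+ 1" in the T1 write. The NULL check on malloc is an addition for the sketch.

```c
#include <stdint.h>
#include <stdlib.h>

typedef long long T1;
typedef int64_t T2;

/* Hypothetical name: blah3 with only the "*t1p2 = temp;" line changed. */
T1 blah3_plus1(void *p1, void *p2)
{
    T1 *t1p = p1;
    T2 *t2p = p2;
    T1 *t1p2;
    T1 temp;

    *t1p = 100;          /* Write as T1 */
    *t2p = 200;          /* Write as T2 */
    temp = *t2p;         /* Read as T2 */
    t1p2 = (T1 *)t2p;    /* Visible T2 to T1 pointer conversion */
    *t1p2 = temp + 1;    /* Write as T1; no longer stores the value just read */
    return *t1p;         /* Read as T1 */
}

/* Hypothetical name: same shape as test3, passing one buffer twice. */
T1 test3_plus1(void)
{
    void *p = malloc(sizeof (T1) + sizeof (T2));
    T1 result;

    if (p == NULL)       /* Added for the sketch; not in the original test */
        return -1;
    result = blah3_plus1(p, p);
    free(p);
    return result;
}
```

One possible reading, which I have not confirmed, is that in the original the T1 store writes back exactly the value just loaded and so looks redundant to the optimiser, whereas here it cannot be dropped.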
I have been testing this using the online compiler at gcc.godbolt.org,
as this lets me easily pick different compiler versions and different
settings, and view the resulting assembly code:
<https://gcc.godbolt.org/#compilers:!((compiler:g62,options:'-O2+-Wall+-Wextra+-x+c+-fstrict-aliasing',sourcez:FAAjGIEsDsGMBsCuATApiAPAZwC7JjgHQAWAfKGCFHEmprgE4wDmJ5lVMCK62e8kAEZsKELrV658AexGiQOAJ4AHVGgBmIeNOjMtOvQBUAjAG55S1RpAEAbABYA%2BjhCGATOY7gNMdCYBiALKGIABEAKTw8Mih8t6o6r6uxgBqAIIAMgCqAKIgxgAMBXE%2B0H5u6dl5bkXA8iYggvAAhsQAzAAUAG7SkMggAFTKxgA0ID19g8puAJTyAN7yYA0DOMbKY6vrHkuuboM4bsqeHMkKqAC2x3WnCusgALwgwycch8qPzzu3Wx9PJpVcqZKAB6EEgADqTBw6GaWGSu1WR0%2B7kBOWBYDBkOhsPh7l2MKunyRx1uWIASqhmv04XsCdtPh0TAMZu8MViUpAsEJ4OUFNIzspetAYQwQLAdF1UAxuTpEWtpp9CaSOFioZAYSBaSZdgxUDhEAxoAd1hjVeDKdStXjjPIAL43SgNGG4ToTZBzDiLW7uqafC7NKLSWAdbkAL1Q0k0TOMMxAAGoQOHI9H3DMZq8ncYQHqsIh4C4nk1Wp0NsoM7t1HrUB1y5mwHqDUac6g8wXMw6OAQQAGYN1eh6FrsGrn84Xzq6OhXbsomCL1B1QoZiOhRwWbPDQskgoYxmucNPTo3DcaCmbKPagAA%3D%3D)),filterAsm:(commentOnly:!t,directives:!t,labels:!t),version:3>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

typedef long long T1;
typedef int64_t T2;
#define T1FMT "%lld"
#define T1VALUE 100
#define T2VALUE 200

T1 blah3(void *p1, void *p2)
{
    T1 *t1p, *t1p2;
    T2 *t2p;
    T1 temp;

    t1p = p1;
    t2p = p2;
    *t1p = T1VALUE;     // Write as T1
    *t2p = T2VALUE;     // Write as T2
    temp = *t2p;        // Read as T2
    t1p2 = (T1*)t2p;    // Visible T2 to T1 pointer conversion
    *t1p2 = temp;       // Write as T1
    return *t1p;        // Read as T1
}

T1 test3(void)
{
    void *p = malloc(sizeof (T1) + sizeof (T2));
    T1 result = blah3(p,p);
    free(p);
    return result;
}

int main(void)
{
    T1 result = test3();
    printf("The result is " T1FMT, result);
    return 0;
}
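For comparison, here is a sketch of a variant that sidesteps the effective-type question rather than answering it: every store and load on the malloc'd buffer goes through memcpy, so the buffer is never accessed through a T1 or T2 lvalue. The names blah3_memcpy and test3_memcpy are hypothetical, and the sketch assumes T1 and T2 have the same size, as they do on the targets discussed above.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef long long T1;
typedef int64_t T2;

/* Hypothetical variant of blah3: all buffer accesses are byte copies. */
T1 blah3_memcpy(void *p1, void *p2)
{
    T1 v1 = 100;
    T2 v2 = 200;
    T2 temp;
    T1 result;

    memcpy(p1, &v1, sizeof v1);          /* Store the T1 value */
    memcpy(p2, &v2, sizeof v2);          /* Store the T2 value */
    memcpy(&temp, p2, sizeof temp);      /* Load it back as T2 */
    v1 = temp;                           /* Convert T2 to T1 by value */
    memcpy(p2, &v1, sizeof v1);          /* Store it again as T1 */
    memcpy(&result, p1, sizeof result);  /* Load the final T1 value */
    return result;
}

/* Hypothetical name: same shape as test3, passing one buffer twice. */
T1 test3_memcpy(void)
{
    void *p = malloc(sizeof (T1) + sizeof (T2));
    T1 result;

    if (p == NULL)                       /* Added for the sketch */
        return -1;
    result = blah3_memcpy(p, p);
    free(p);
    return result;
}
```

When both pointers refer to the same buffer, the last store overwrites the first, so this version should give 200 regardless of optimisation level. It is only a point of comparison; it does not settle whether the pointer-based version is valid.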