On Sat, 31 Oct 2015, Bruno Loff wrote:
I am always impressed by the power of the GCC optimizer. Today I found a somewhat surprising abnormality when using compound-expressions. Look at the two definitions for the function f(a) = a*a + a:

    int64_t f1( int64_t a ) {
        return a * a + a;
    }

    int64_t f2( int64_t a ) {
        return ({
            int64_t b;
            b = a * a;
            ({
                int64_t c;
                c = b + a;
                c;
            });
        });
    }

I expected that GCC would either make a mess with the second definition, or would smartly produce the same code for both definitions. I was wrong. Here is the (simplified) x86-64 output with -O3:

    f1:
        leaq    1(%rdi), %rax
        imulq   %rdi, %rax
        ret

    f2:
        movq    %rdi, %rax
        imulq   %rdi, %rax
        addq    %rdi, %rax
        ret

The code for f2 is what I expected, but if I was a little smarter (and knew more asm) I might have instead expected f1. The code for f1 basically does

    b := a + 1
    b := b * a

Whereas the code for f2 does:

    b := a
    b := b * a
    b := b + a

The code for f1 is clearly better, saving on one instruction. They are, of course, completely equivalent.
It isn't that obvious to me which version is better, but I agree that both should generate the same code.
So why is GCC failing to optimize the compound expressions all the way? My guess would be that it has to do with the order in which some optimization passes are happening. Anyone?
A number of optimizations happen, for historical reasons, during parsing, when the front-end calls functions from fold-const.c on expressions. We are currently moving many such optimizations to a later stage (using match.pd); if this transformation is moved, it will also apply to f2.
-fdump-tree-all can give you a lot of information about the various stages of optimization.
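
For anyone who wants to experiment, here is a minimal sketch that puts the two definitions next to the factored form written out by hand. The file name fold.c, the function f3 and the helper all_agree are made up for this illustration and are not part of the original report. My reading of the explanation above is that in f1 the folder sees a*a + a as a single expression, while in f2 it only ever sees a*a and b + a separately; the dumps should show whether that holds for your GCC version.

```c
/* fold.c (hypothetical file name). Build with GCC, for example:
 *     gcc -O3 -S -fdump-tree-all fold.c
 * The ({ ... }) statement expressions are a GNU C extension. */
#include <stdint.h>

/* A single expression: a*a + a is visible to the front-end as a whole. */
int64_t f1(int64_t a)
{
    return a * a + a;
}

/* The same value, built through the temporaries b and c. */
int64_t f2(int64_t a)
{
    return ({
        int64_t b;
        b = a * a;
        ({
            int64_t c;
            c = b + a;
            c;
        });
    });
}

/* The factored form (a + 1) * a written by hand; it corresponds to the
 * leaq/imulq sequence reported for f1. The name f3 is an assumption. */
int64_t f3(int64_t a)
{
    return (a + 1) * a;
}

/* Returns 1 when all three definitions agree for the given input. */
int all_agree(int64_t a)
{
    return f1(a) == f2(a) && f2(a) == f3(a);
}
```

Comparing the assembly for f1, f2 and f3, and then the earliest dump files for f1 and f2, should show at which stage the two start to differ.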
-- Marc Glisse