Ok, I got it, thank you Marc.

On 31 October 2015 at 21:21, Marc Glisse <marc.glisse@xxxxxxxx> wrote:
> On Sat, 31 Oct 2015, Bruno Loff wrote:
>
>> I am always impressed by the power of the GCC optimizer. Today I found
>> a somewhat surprising abnormality when using compound-expressions.
>> Look at the two definitions for the function f(a) = a*a + a:
>>
>>
>> int64_t f1( int64_t a ) {
>>     return a * a + a;
>> }
>>
>> int64_t f2( int64_t a ) {
>>     return ({
>>         int64_t b;
>>         b = a * a;
>>         ({
>>             int64_t c;
>>             c = b + a;
>>             c;
>>         });
>>     });
>> }
>>
>> I expected that GCC would either make a mess with the second
>> definition, or would smartly produce the same code for both
>> definitions. I was wrong. Here is the (simplified) x86-64 output
>> with -O3:
>>
>> f1:
>>     leaq 1(%rdi), %rax
>>     imulq %rdi, %rax
>>     ret
>>
>> f2:
>>     movq %rdi, %rax
>>     imulq %rdi, %rax
>>     addq %rdi, %rax
>>     ret
>>
>>
>>
>> The code for f2 is what I expected, but if I was a little smarter (and
>> knew more asm) I might have instead expected f1. The code for f1
>> basically does
>>
>> b := a + 1
>> b := b * a
>>
>> Whereas the code for f2 does:
>>
>> b := a
>> b := b * a
>> b := b + a
>>
>> The code for f1 is clearly better, saving one instruction. They
>> are, of course, completely equivalent.
>
>
> It isn't that obvious to me which version is better, but I agree that both
> should generate the same code.
>
>> So why is GCC failing to optimize the compound expressions all the
>> way? My guess would be that it has to do with the order in which some
>> optimization passes are happening. Anyone?
>
>
> A number of optimizations happen, for historical reasons, during parsing,
> when the front-end calls functions from fold-const.c on expressions. We are
> currently moving many such optimizations to a later stage (using match.pd).
> If this transformation is moved, it will also apply to f2.
>
> -fdump-tree-all can give you a lot of information about the various stages
> of optimization.
>
> --
> Marc Glisse
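
For anyone who wants to reproduce the comparison, here is a minimal sketch of a test file. f1 and f2 are copied from the thread; f1_factored is a hypothetical name (not from the thread) that spells out the (a + 1) * a form which the leaq/imulq sequence for f1 computes. The ({ ... }) syntax is a GNU C extension, so this assumes GCC or a compatible compiler.

```c
#include <stdint.h>

/* Plain expression form, as posted in the thread. */
int64_t f1(int64_t a) {
    return a * a + a;
}

/* Same computation written with nested statement expressions
   (GNU C extension), as posted in the thread. */
int64_t f2(int64_t a) {
    return ({
        int64_t b;
        b = a * a;
        ({
            int64_t c;
            c = b + a;
            c;
        });
    });
}

/* Hypothetical helper, not from the thread: the factored form
   (a + 1) * a, written out by hand for comparison. */
int64_t f1_factored(int64_t a) {
    return (a + 1) * a;
}
```

Compiling this with gcc -O3 -S shows the assembly for each function, and adding -fdump-tree-all writes out the intermediate dumps Marc mentions, which should make it possible to see at which stage f1 and f2 diverge. The output may well differ between GCC versions.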