On 07/05/2013 01:35 AM, dw wrote:
>
>>> So, you are getting the duplicate zeroing of eax, but not
>>> the duplicate loading of rdi.
>>>
>>> I'm using 4.8.0.  You?
>> Ah, I'm on 4.7.  So this might be a regression: we should check.
>
> Compiling with 4.7.2 and 4.7.3, I also get no duplicate loads on rdi.
> But both 4.8.0 and 4.8.1 do.  All 4 duplicate the zeroing of eax.
>
> What now?

First, I'd correct the asm:

  asm volatile ( "rep stosb"
                 : "+D" (Dest), "+c" (Count)
                 : "a" (Data)
                 : "memory");

then see if there was a real regression.  As far as I can see
current GCC is (nearly) optimal:

main:
.LFB1:
	leaq	-32(%rsp), %rdi
	movl	$32, %ecx
	xorl	%eax, %eax
#APP
# 2 "z.c" 1
	rep stosb
# 0 "" 2
#NO_APP
	movl	$32, %ecx
	leaq	-32(%rsp), %rdi
#APP
# 2 "z.c" 1
	rep stosb
# 0 "" 2
#NO_APP
	xorl	%eax, %eax
	ret

Andrew.
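
For anyone who wants to try the corrected asm statement in isolation, here is a minimal sketch. It is not the z.c from this thread, which was not posted in full: the names fill_bytes and self_check, the 32-byte buffer, and the portable fallback branch are all my own assumptions for illustration. It uses the same constraints as the corrected statement, wrapped in a function and called twice on the same buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper around the corrected asm; the name and
   signature are assumptions, not taken from the original z.c. */
static void fill_bytes(void *Dest, unsigned char Data, size_t Count)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* "+D" and "+c": rep stosb advances the destination register and
       counts the count register down to zero, so both are read-write.
       "a" is input only: the instruction does not modify al.
       "memory": the asm writes memory the compiler cannot see.
       __asm__ is GCC's alternate spelling of asm, accepted even
       under strict -std= modes. */
    __asm__ __volatile__ ("rep stosb"
                          : "+D" (Dest), "+c" (Count)
                          : "a" (Data)
                          : "memory");
#else
    /* Plain-C fallback so the sketch still builds on other targets. */
    unsigned char *p = Dest;
    while (Count--)
        *p++ = Data;
#endif
}

/* Hypothetical check: fill the same 32-byte buffer twice and verify
   every byte after each fill.  Returns 1 if all assertions pass. */
static int self_check(void)
{
    unsigned char buf[32];
    size_t i;

    fill_bytes(buf, 0xAA, sizeof buf);
    for (i = 0; i < sizeof buf; i++)
        assert(buf[i] == 0xAA);

    fill_bytes(buf, 0, sizeof buf);
    for (i = 0; i < sizeof buf; i++)
        assert(buf[i] == 0);

    return 1;
}
```

Because Dest and Count are declared as read-write operands, the compiler has to assume both registers are changed by each asm and set them up again before the next one. Data is input only, so the compiler is free to keep the value in eax across consecutive asm statements when it is the same.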