Hi,

Consider the following code:

    int foo(char *t, char *v, int w)
    {
        int i;
        for (i = 1; i != w; ++i) {
            int x = i << 2;
            v[x + 4] = t[x + 4];
        }
        return 0;
    }

Compile it for x86 (I tried both gcc 4.7.2 and gcc 4.8.1) with the options:

    gcc -O2 -m32 -S test.c

The generated loop looks like this:

    .L5:
        leal    0(,%eax,4), %edx
        addl    $1, %eax
        movzbl  4(%edi,%edx), %ecx
        cmpl    %ebx, %eax
        movb    %cl, 4(%esi,%edx)
        jne     .L5

But it can easily be simplified to something like this:

    .L5:
        addl    $1, %eax
        movzbl  (%esi,%eax,4), %edx
        cmpl    %ecx, %eax
        movb    %dl, (%ebx,%eax,4)
        jne     .L5

(i.e. the left shift can be folded into the address as a scale factor of 4; the constant offset of 4 disappears because %eax is incremented before the two memory accesses, so 4 * (i + 1) equals (i << 2) + 4).

First question, for the gcc-help mailing list: maybe there are some options that I have missed, and there IS a way to tell gcc that I want it to do this?

Second question, for the gcc developers mailing list: I am working on a private backend and want to add this optimization to it. What would you advise me to do: write a custom GIMPLE pass, write an RTL pass, modify an existing pass, or something else?

---
With best regards, Konstantin
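
P.S. To make the intended transformation concrete at the source level, here is a minimal sketch. `foo` is the original function; `foo_scaled` and `check_equivalent` are hypothetical names I made up for illustration. `foo_scaled` only spells out that `x + 4` equals `(i + 1) * 4`, so the index can be a single incremented register scaled by 4. I have not checked what code gcc emits for the second form; the helper only confirms that both versions copy the same bytes.

```c
#include <assert.h>
#include <string.h>

/* Original loop from the message above. */
int foo(char *t, char *v, int w)
{
    int i;
    for (i = 1; i != w; ++i) {
        int x = i << 2;
        v[x + 4] = t[x + 4];
    }
    return 0;
}

/* Hypothetical hand-rewritten variant: since (i << 2) + 4 == (i + 1) * 4,
 * each access uses one index k = i + 1 multiplied by 4. */
int foo_scaled(char *t, char *v, int w)
{
    int i;
    for (i = 1; i != w; ++i) {
        int k = i + 1;
        v[k * 4] = t[k * 4];
    }
    return 0;
}

/* Hypothetical helper: runs both versions on identical input and asserts
 * that the outputs match. Assumes 1 <= w <= 15, so the largest index
 * touched (4 * w) stays inside the 64-byte buffers. */
void check_equivalent(int w)
{
    char src[64], a[64], b[64];
    int i;
    for (i = 0; i < 64; ++i) {
        src[i] = (char)(i + 1);
        a[i] = 0;
        b[i] = 0;
    }
    foo(src, a, w);
    foo_scaled(src, b, w);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

Calling `check_equivalent(5)`, for example, compares the two versions over i = 1..4, i.e. byte offsets 8, 12, 16 and 20.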