Hi Zeev,

On Fri, Apr 15, 2011 at 07:24:42PM +0300, Zeev Tarantov wrote:
> When compiling this code:
>
> unsigned int get_le32(unsigned char *p)
> {
>   return p[0] | p[1] << 8 | p[2] << 16 | p[3] << 24;
> }
>
> On gcc 4.6.0 rev. 172266 for x86-64, I get:
>
>         movzbl  1(%rdi), %eax
>         movzbl  2(%rdi), %edx
>         sall    $8, %eax
>         sall    $16, %edx
>         orl     %edx, %eax
>         movzbl  (%rdi), %edx
>         orl     %edx, %eax
>         movzbl  3(%rdi), %edx
>         sall    $24, %edx
>         orl     %edx, %eax
>         ret
>
> I hoped for much better code. I hoped to avoid ifdef's depending on
> endianess, but this means I can't.
> Am I missing something obvious that precludes the compiler from
> optimizing the expression?
> This is not a regression and other compilers didn't do any better, so
> I hope I'm just missing something.

I don't know how to solve your problem exactly; however, rewriting your
code solves it (I think my code does the same as yours...):

unsigned int get_le32(unsigned char *p)
{
  unsigned char p1[4];
  for(int i = 0; i < 3; ++i)
    p1[i] = p[i] << i*8;
  return p1[0] | p1[1] | p1[2] | p1[3];
}

gets compiled into two lines:

        movzbl  (%rdi), %eax
        ret

(this is both gcc 4.7 from the trunk and gcc 4.4.5, also x86-64)

Axel
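
A caveat on the rewrite above, offered as a sketch rather than a definitive
diagnosis: because p1 is an array of unsigned char, each shifted value is
converted back to 8 bits when it is stored, so p1[1] and p1[2] are always 0,
and the loop stops before p1[3] is ever written. That would likely explain
the two-instruction output, which loads only the first byte. The sketch
below puts a portable form of the original expression next to the
truncation. The names get_le32_portable and truncated_byte are hypothetical
and do not appear in the thread.

```c
#include <stdint.h>

/* Portable little-endian load: each byte is widened to uint32_t before
 * shifting, so the shift by 24 never touches the sign bit of an int. */
uint32_t get_le32_portable(const unsigned char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Hypothetical helper, for illustration only: reproduces what one
 * iteration of the rewritten loop stores into an unsigned char slot.
 * Intended for i in 0..2, matching the loop bounds in the rewrite.
 * For i >= 1 the shifted value is a multiple of 256, so the
 * conversion back to 8 bits always yields 0. */
unsigned char truncated_byte(const unsigned char *p, int i)
{
    return (unsigned char)(p[i] << (i * 8));
}
```

With the bytes 0x78 0x56 0x34 0x12, get_le32_portable returns 0x12345678,
while truncated_byte yields 0 for i = 1 and i = 2. The casts to uint32_t
also avoid shifting a promoted int into its sign bit when p[3] is 0x80 or
larger.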