Sorry, just forget my code :-(

On Fri, Apr 15, 2011 at 06:45:32PM +0200, Axel Freyn wrote:
> Hi Zeev,
> On Fri, Apr 15, 2011 at 07:24:42PM +0300, Zeev Tarantov wrote:
> > When compiling this code:
> >
> > unsigned int get_le32(unsigned char *p)
> > {
> >   return p[0] | p[1] << 8 | p[2] << 16 | p[3] << 24;
> > }
> >
> > On gcc 4.6.0 rev. 172266 for x86-64, I get:
> >
> >         movzbl  1(%rdi), %eax
> >         movzbl  2(%rdi), %edx
> >         sall    $8, %eax
> >         sall    $16, %edx
> >         orl     %edx, %eax
> >         movzbl  (%rdi), %edx
> >         orl     %edx, %eax
> >         movzbl  3(%rdi), %edx
> >         sall    $24, %edx
> >         orl     %edx, %eax
> >         ret
> >
> > I hoped for much better code. I hoped to avoid ifdef's depending on
> > endianness, but this means I can't.
> > Am I missing something obvious that precludes the compiler from
> > optimizing the expression?
> > This is not a regression and other compilers didn't do any better, so
> > I hope I'm just missing something.
> I don't know how to solve your problem exactly, however rewriting your
> code solves it (I think my code does the same as yours...):
> unsigned int get_le32(unsigned char *p)
> {
>   unsigned char p1[4];
>   for(int i = 0; i < 3; ++i)
>     p1[i] = p[i] << i*8;
>   return p1[0] | p1[1] | p1[2] | p1[3];
> }
> gets compiled into two lines:
>         movzbl  (%rdi), %eax
>         ret
>
> (this is both gcc 4.7 from the trunk and gcc 4.4.5, also x86-64)
>
> Axel
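
For the record, here is a rough sketch of why the quoted rewrite collapses to a single byte load and is not equivalent to the original. Each shifted value is stored back into an unsigned char, so everything above bit 7 is thrown away; only p[0] survives. In the quoted version p1[3] is also never written, so its value is indeterminate. The function name get_le32_truncating is made up for this illustration, and its loop runs to 4 with a zero-initialized buffer so that the example itself is well defined. The first function is the original expression with each byte widened to uint32_t before shifting, on the assumption that a fixed-width result is wanted; as far as I can tell, the plain-int version is undefined for p[3] >= 0x80, because p[3] << 24 then no longer fits in a signed int.

```c
#include <stdint.h>

/* Portable little-endian load. Each byte is widened to uint32_t before the
   shift, so the shift of p[3] by 24 happens in an unsigned type. */
uint32_t get_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Hypothetical variant of the retracted rewrite, for illustration only.
   Storing the shifted value into an unsigned char keeps just the low
   8 bits, so p1[1], p1[2] and p1[3] are always 0 and the result is p[0]. */
uint32_t get_le32_truncating(const unsigned char *p)
{
    unsigned char p1[4] = {0};
    for (int i = 0; i < 4; ++i)
        p1[i] = (unsigned char)((uint32_t)p[i] << (i * 8));
    return (uint32_t)(p1[0] | p1[1] | p1[2] | p1[3]);
}
```

With the bytes 0x78 0x56 0x34 0x12 as input, the first function yields 0x12345678 and the second yields 0x78, which matches the two-instruction output quoted above.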