Re: missed optimization compiling naive get_unaligned_le32 on x86

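Sorry, just forget my code :-( It is wrong: the loop stops at i < 3, so
p1[3] is never initialized, and storing p[i] << i*8 into an unsigned char
throws away every bit shifted past bit 7, so only p[0] survives. That is
why it compiles down to a single movzbl.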
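
For reference, a loop that does compute the same value as Zeev's
expression could look roughly like the sketch below. The name
get_le32_loop is made up for illustration and is not from the thread.
The sketch assumes a 32-bit accumulator and a cast before the shift, so
that p[3] << 24 is not done in signed int.

```c
#include <stdint.h>

/* Hypothetical helper name; not from the original thread. */
static uint32_t get_le32_loop(const unsigned char *p)
{
    uint32_t v = 0;                      /* 32-bit accumulator: no bits are truncated */
    for (int i = 0; i < 4; ++i)          /* all four bytes, i = 0..3 */
        v |= (uint32_t)p[i] << (i * 8);  /* widen first, then shift into place */
    return v;
}
```

This only fixes the correctness problem. Whether a particular GCC
version turns it into a single 32-bit load is a separate question, and
the sketch makes no claim about that.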

On Fri, Apr 15, 2011 at 06:45:32PM +0200, Axel Freyn wrote:
> Hi Zeev,
> On Fri, Apr 15, 2011 at 07:24:42PM +0300, Zeev Tarantov wrote:
> > When compiling this code:
> > 
> > unsigned int get_le32(unsigned char *p)
> > {
> >   return p[0] | p[1] << 8 | p[2] << 16 | p[3] << 24;
> > }
> > 
> > On gcc 4.6.0 rev. 172266 for x86-64, I get:
> > 
> >         movzbl  1(%rdi), %eax
> >         movzbl  2(%rdi), %edx
> >         sall    $8, %eax
> >         sall    $16, %edx
> >         orl     %edx, %eax
> >         movzbl  (%rdi), %edx
> >         orl     %edx, %eax
> >         movzbl  3(%rdi), %edx
> >         sall    $24, %edx
> >         orl     %edx, %eax
> >         ret
> > 
> > I hoped for much better code. I hoped to avoid ifdef's depending on
> > endianness, but this means I can't.
> > Am I missing something obvious that precludes the compiler from
> > optimizing the expression?
> > This is not a regression and other compilers didn't do any better, so
> > I hope I'm just missing something.
> I don't know how to solve your problem exactly, however rewriting your
> code solves it (I think my code does the same as yours...):
> unsigned int get_le32(unsigned char *p)
> {
>   unsigned char p1[4];
>   for(int i = 0; i < 3; ++i)
>     p1[i] = p[i] << i*8;
>   return p1[0] | p1[1] | p1[2] | p1[3];
> }
> gets compiled into two lines:
> 	movzbl	(%rdi), %eax
> 	ret
> 
> (this is both gcc 4.7 from the trunk and gcc 4.4.5, also x86-64)
> 
> Axel

