When do uint8_t casts generate better code than & 0xFF, and why?

Hi,

I've noticed that casting to uint8_t often yields slightly more
compact code than masking with 0xFF, even though the two forms should
be equivalent. To give a contrived example,


#include <stdint.h>

unsigned u;
uint8_t u8;

void f() {
    u = (u8 << 3) & 0xFF;       /* mask the shifted value down to 8 bits */
}

void g() {
    u = (uint8_t)(u8 << 3);     /* cast the shifted value down to 8 bits */
}


becomes


00000000 <f()>:
   0:    0f b6 05 00 00 00 00     movzbl 0x0,%eax
   7:    c1 e0 03                 shl    $0x3,%eax
   a:    25 ff 00 00 00           and    $0xff,%eax
   f:    a3 00 00 00 00           mov    %eax,0x0
  14:    c3                       ret
  15:    8d 74 26 00              lea    0x0(%esi,%eiz,1),%esi
  19:    8d bc 27 00 00 00 00     lea    0x0(%edi,%eiz,1),%edi

00000020 <g()>:
  20:    0f b6 05 00 00 00 00     movzbl 0x0,%eax
  27:    c1 e0 03                 shl    $0x3,%eax
  2a:    0f b6 c0                 movzbl %al,%eax
  2d:    a3 00 00 00 00           mov    %eax,0x0
  32:    c3                       ret
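
For what it's worth, aside from the alignment padding after f()'s
ret, the only difference between the two listings is the instruction
that clears the upper bits: the and $0xff,%eax encoding is five bytes
(25 ff 00 00 00), while movzbl %al,%eax is three (0f b6 c0), which is
where the size difference comes from. As a sanity check that the two
forms really are equivalent (u8 is promoted to int before the shift,
so both just keep the low eight bits), here is a quick test; the
function names are mine and purely illustrative:


#include <stdint.h>
#include <assert.h>

/* Sketch of the claimed equivalence: both expressions keep only the
   low 8 bits of the shifted value. */
static unsigned mask_form(uint8_t x) { return (x << 3) & 0xFF; }
static unsigned cast_form(uint8_t x) { return (uint8_t)(x << 3); }

int main(void) {
    for (unsigned i = 0; i <= 0xFF; i++)
        assert(mask_form((uint8_t)i) == cast_form((uint8_t)i));
    return 0;
}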


Is there a good reason why GCC keeps the and $0xff for the bitmask
instead of using movzbl as it does for the cast, or is it just a
missed optimization? And when might the cast generate worse code than
the mask?
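
In case it matters, the rewrite is only available when the mask is
exactly the low byte. Here is a contrived variation of my own, with a
wider operand and masks that keep nine and seven bits, where no
single uint8_t cast can substitute and the compiler has to clear the
upper bits some other way:


#include <stdint.h>

unsigned v;
uint16_t w;

/* Hypothetical variants: 0x1FF keeps nine bits and 0x7F keeps seven,
   so a plain byte zero-extension cannot implement either mask. */
void h() {
    v = (w << 3) & 0x1FF;
}

void k() {
    v = (w << 3) & 0x7F;
}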

/Ulf



