On Thu, Jan 30, 2020 at 03:34:27PM -0800, John Fastabend wrote:
>
> diff --git a/llvm/lib/Target/BPF/BPFInstrInfo.td b/llvm/lib/Target/BPF/BPFInstrInfo.td
> index 0f39294..a187103 100644
> --- a/llvm/lib/Target/BPF/BPFInstrInfo.td
> +++ b/llvm/lib/Target/BPF/BPFInstrInfo.td
> @@ -733,7 +733,7 @@ def : Pat<(i64 (sext GPR32:$src)),
>            (SRA_ri (SLL_ri (MOV_32_64 GPR32:$src), 32), 32)>;
>
>  def : Pat<(i64 (zext GPR32:$src)),
> -           (SRL_ri (SLL_ri (MOV_32_64 GPR32:$src), 32), 32)>;
> +           (MOV_32_64 GPR32:$src)>;

That's a good optimization, and the 32-bit zero-extend after mov32 is
not x86 specific; it's mandatory on all archs. But it won't solve this
problem. There are both signed and unsigned extensions in that program.
The one that breaks is the _signed_ one, and it cannot be optimized into
any other instruction by llvm. Hence the proposal to do a pseudo insn
for it and upgrade it to uapi later.

llvm-objdump -S test_get_stack_rawtp.o

; usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
      52:       85 00 00 00 43 00 00 00  call 67
      53:       b7 02 00 00 00 00 00 00  r2 = 0
      54:       bc 08 00 00 00 00 00 00  w8 = w0
; if (usize < 0)
      55:       bc 81 00 00 00 00 00 00  w1 = w8
      56:       67 01 00 00 20 00 00 00  r1 <<= 32
      57:       c7 01 00 00 20 00 00 00  r1 s>>= 32
      58:       6d 12 1a 00 00 00 00 00  if r2 s> r1 goto +26 <LBB0_6>

56 and 57 are the shifts.
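
For context, the source pattern behind that disassembly is roughly the
following C sketch (paraphrased from the selftest, not verbatim):

    /* bpf_get_stack() returns a size or a negative error */
    int usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);

    /* With alu32, usize lives in a 32-bit subregister (w8 above).
     * The signed compare against 0 is done on the full 64-bit
     * register, so llvm has to sign-extend w8 first, and the only
     * way to express that in the current isa is the <<= 32 / s>>= 32
     * shift pair.
     */
    if (usize < 0)
        return 0;

The unsigned case is the easy one: any write to a wN subregister
(e.g. w8 = w0 at insn 54) already zero-extends into the full rN, which
is why the zext pattern can collapse to a plain MOV_32_64.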