Alexei Starovoitov wrote:
> On Thu, Jan 30, 2020 at 03:34:27PM -0800, John Fastabend wrote:
> >
> > diff --git a/llvm/lib/Target/BPF/BPFInstrInfo.td b/llvm/lib/Target/BPF/BPFInstrInfo.td
> > index 0f39294..a187103 100644
> > --- a/llvm/lib/Target/BPF/BPFInstrInfo.td
> > +++ b/llvm/lib/Target/BPF/BPFInstrInfo.td
> > @@ -733,7 +733,7 @@ def : Pat<(i64 (sext GPR32:$src)),
> >            (SRA_ri (SLL_ri (MOV_32_64 GPR32:$src), 32), 32)>;
> >
> >  def : Pat<(i64 (zext GPR32:$src)),
> > -          (SRL_ri (SLL_ri (MOV_32_64 GPR32:$src), 32), 32)>;
> > +          (MOV_32_64 GPR32:$src)>;
>
> That's a good optimization, and the 32-bit zero-extend after mov32 is not x86
> specific. It's mandatory on all archs.
>
> But it won't solve this problem. There are both signed and unsigned extensions
> in that program. The one that breaks is the _signed_ one and it cannot be
> optimized into any other instruction by llvm.
> Hence the proposal to do a pseudo insn for it and upgrade to uapi later.

Those are both coming from the LLVM IR zext call. With the above patch, 56 and
57 are omitted so there are no shifts. I'll check again just to be sure and put
the details in a patch for the backend.

> llvm-objdump -S test_get_stack_rawtp.o
>
> ; usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
>       52:       85 00 00 00 43 00 00 00 call 67
>       53:       b7 02 00 00 00 00 00 00 r2 = 0
>       54:       bc 08 00 00 00 00 00 00 w8 = w0
> ; if (usize < 0)
>       55:       bc 81 00 00 00 00 00 00 w1 = w8
>       56:       67 01 00 00 20 00 00 00 r1 <<= 32
>       57:       c7 01 00 00 20 00 00 00 r1 s>>= 32
>       58:       6d 12 1a 00 00 00 00 00 if r2 s> r1 goto +26 <LBB0_6>

56 and 57 are the shifts. Agreed, it doesn't make much sense to me at the
moment that the r1 s>>= 32 is signed. I'll take a look in the morning. That
fragment, 55-57, is coming from a zext in LLVM though.

FWIW, once the shifts are removed the next issue is that coerce_reg_to_size()
loses smax info that it needs. Something like this is needed so that if we
have a tight smax_value we don't lose it to the mask.

@@ -2805,9 +2804,32 @@ static void coerce_reg_to_size(struct bpf_reg_state *reg, int size)
 		reg->umax_value = mask;
 	}
 	reg->smin_value = reg->umin_value;
-	reg->smax_value = reg->umax_value;
+	if (reg->smax_value < 0 || reg->smax_value > reg->umax_value)
+		reg->smax_value = reg->umax_value;
+}
+

I'll write up the details in a patch once we iron out the LLVM zext IR signed
shift.

.John
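
P.S. In case it helps the discussion, here is a minimal standalone sketch of
the bounds reasoning in the hunk above. It is not verifier code; the struct
and function names (reg_bounds, coerce_bounds_to_size) are pared-down
stand-ins for struct bpf_reg_state and coerce_reg_to_size(), just to show
that smax_value only has to be widened to umax_value when it is negative or
already larger than the truncated umax_value, so a tight smax survives the
32-bit coercion:

#include <stdio.h>
#include <stdint.h>

/* pared-down stand-in for the verifier's register bounds */
struct reg_bounds {
	int64_t smin_value, smax_value;
	uint64_t umin_value, umax_value;
};

/* sketch of the truncation logic; assumes size < 8 so the shift is defined */
static void coerce_bounds_to_size(struct reg_bounds *reg, int size)
{
	uint64_t mask = ((uint64_t)1 << (size * 8)) - 1;

	if (reg->umax_value > mask) {
		reg->umin_value = 0;
		reg->umax_value = mask;
	}
	reg->smin_value = reg->umin_value;
	/*
	 * Keep a tight smax_value when it already fits in [0, umax].
	 * The smax < 0 check runs first, so the cast below only sees
	 * a nonnegative smax.
	 */
	if (reg->smax_value < 0 || reg->smax_value > (int64_t)reg->umax_value)
		reg->smax_value = reg->umax_value;
}

int main(void)
{
	/* e.g. a helper return value already known to be in [0, 256] */
	struct reg_bounds r = {
		.smin_value = 0, .smax_value = 256,
		.umin_value = 0, .umax_value = 256,
	};

	coerce_bounds_to_size(&r, 4);
	/* prints 256, not the 32-bit mask, because smax was already tight */
	printf("smax after 32-bit coerce: %lld\n", (long long)r.smax_value);
	return 0;
}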