Alexei Starovoitov wrote:
> On Fri, Jan 31, 2020 at 9:16 AM John Fastabend <john.fastabend@xxxxxxxxx> wrote:
> >
> > Also I don't mind building a pseudo instruction here for signed extension,
> > but it's not clear to me why we are getting different instruction
> > selections? It's not clear to me why sext is being chosen in your case?
>
> Sign extension has to be there if jmp64 is used.
> So the difference is due to -mcpu=v2 vs -mcpu=v3
> v2 does alu32, but not jmp32
> v3 does both.
> By default selftests are using -mcpu=probe which
> detects v2/v3 depending on running kernel.
>
> llc -mattr=dwarfris -march=bpf -mcpu=v3 -mattr=+alu32
> ; usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
>       48: bf 61 00 00 00 00 00 00 r1 = r6
>       49: bf 72 00 00 00 00 00 00 r2 = r7
>       50: b4 03 00 00 20 03 00 00 w3 = 800
>       51: b7 04 00 00 00 01 00 00 r4 = 256
>       52: 85 00 00 00 43 00 00 00 call 67
>       53: bc 08 00 00 00 00 00 00 w8 = w0
> ; if (usize < 0)
>       54: c6 08 16 00 00 00 00 00 if w8 s< 0 goto +22 <LBB0_6>
> ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
>       55: 1c 89 00 00 00 00 00 00 w9 -= w8
>       56: bc 81 00 00 00 00 00 00 w1 = w8
>       57: 67 01 00 00 20 00 00 00 r1 <<= 32
>       58: 77 01 00 00 20 00 00 00 r1 >>= 32
>       59: bf 72 00 00 00 00 00 00 r2 = r7
>       60: 0f 12 00 00 00 00 00 00 r2 += r1
>       61: bf 61 00 00 00 00 00 00 r1 = r6
>       62: bc 93 00 00 00 00 00 00 w3 = w9
>       63: b7 04 00 00 00 00 00 00 r4 = 0
>       64: 85 00 00 00 43 00 00 00 call 67
> ; if (ksize < 0)
>       65: c6 00 0b 00 00 00 00 00 if w0 s< 0 goto +11 <LBB0_6>
>
> llc -mattr=dwarfris -march=bpf -mcpu=v2 -mattr=+alu32
> ; usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
>       48: bf 61 00 00 00 00 00 00 r1 = r6
>       49: bf 72 00 00 00 00 00 00 r2 = r7
>       50: b4 03 00 00 20 03 00 00 w3 = 800
>       51: b7 04 00 00 00 01 00 00 r4 = 256
>       52: 85 00 00 00 43 00 00 00 call 67
>       53: bc 08 00 00 00 00 00 00 w8 = w0
> ; if (usize < 0)
>       54: bc 81 00 00 00 00 00 00 w1 = w8
>       55: 67 01 00 00 20 00 00 00 r1 <<= 32
>       56: c7 01 00 00 20 00 00 00 r1 s>>= 32
>       57: c5 01 19 00 00 00 00 00 if r1 s< 0 goto +25 <LBB0_6>
> ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
>       58: 1c 89 00 00 00 00 00 00 w9 -= w8
>       59: bc 81 00 00 00 00 00 00 w1 = w8
>       60: 67 01 00 00 20 00 00 00 r1 <<= 32
>       61: 77 01 00 00 20 00 00 00 r1 >>= 32
>       62: bf 72 00 00 00 00 00 00 r2 = r7
>       63: 0f 12 00 00 00 00 00 00 r2 += r1
>       64: bf 61 00 00 00 00 00 00 r1 = r6
>       65: bc 93 00 00 00 00 00 00 w3 = w9
>       66: b7 04 00 00 00 00 00 00 r4 = 0
>       67: 85 00 00 00 43 00 00 00 call 67
> ; if (ksize < 0)
>       68: bc 01 00 00 00 00 00 00 w1 = w0
>       69: 67 01 00 00 20 00 00 00 r1 <<= 32
>       70: c7 01 00 00 20 00 00 00 r1 s>>= 32
>       71: c5 01 0b 00 00 00 00 00 if r1 s< 0 goto +11 <LBB0_6>
>
> zext is there in both cases and it will be optimized with your llvm patch.
> So please send it. Don't delay :)

LLVM patch here: https://reviews.llvm.org/D73985

With the updated LLVM I can pass the selftests with the above fix plus the
additional patch below to get tighter bounds on 32-bit registers. So going
forward I think we need to review the LLVM patch and, assuming it looks
good, commit it and then move forward with this series.

---

bpf: coerce reg, use tighter max bound if possible

When we do a coerce_reg_to_size() we currently lose a possibly valid
upper bound in the case where (a) smax is non-negative and (b) smax is
less than the max value representable in the new reg size. If both (a)
and (b) are satisfied we can keep the smax bound. (a) is required to
ensure we do not keep a bound whose upper sign bit was truncated away,
and (b) is required to ensure the previously set bits fit inside the
new reg width.
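As a worked example (hypothetical bounds, not pulled from the selftest):
take a 64-bit reg tracked as smin_value = -5, smax_value = 100 with
unknown unsigned bounds [0, U64_MAX], coerced down to 4 bytes. The
unsigned bounds collapse to [0, U32_MAX] and smin_value becomes 0, but
today smax_value is widened to U32_MAX as well. Since smax_value = 100
satisfies both (a) and (b), it is still a valid upper bound for the
32-bit value and can be kept.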
Signed-off-by: John Fastabend <john.fastabend@xxxxxxxxx>

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1cc945d..e5349d6 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2805,7 +2805,8 @@ static void coerce_reg_to_size(struct bpf_reg_state *reg, int size)
 		reg->umax_value = mask;
 	}
 	reg->smin_value = reg->umin_value;
-	reg->smax_value = reg->umax_value;
+	if (reg->smax_value < 0 || reg->smax_value > reg->umax_value)
+		reg->smax_value = reg->umax_value;
 }
 
 static bool bpf_map_is_rdonly(const struct bpf_map *map)
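If it helps review, below is a minimal userspace sketch of the clamping
rule. The struct and the bounds-fixing logic are simplified
approximations of the verifier's coerce_reg_to_size(), not kernel code,
and the example values are made up:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* simplified stand-in for the s64/u64 bounds in struct bpf_reg_state */
struct bounds {
	int64_t smin, smax;
	uint64_t umin, umax;
};

/* size is the truncated width in bytes: 1, 2 or 4 */
static void coerce(struct bounds *b, int size)
{
	uint64_t mask = ((uint64_t)1 << (size * 8)) - 1;

	if (b->umax > mask) {
		b->umin = 0;
		b->umax = mask;
	}
	b->smin = (int64_t)b->umin;
	/* the patched rule: keep a tighter non-negative smax instead of
	 * unconditionally widening it to umax
	 */
	if (b->smax < 0 || (uint64_t)b->smax > b->umax)
		b->smax = (int64_t)b->umax;
}

int main(void)
{
	/* a reg known to be signed <= 100, unsigned bounds unknown */
	struct bounds b = { .smin = -5, .smax = 100,
			    .umin = 0, .umax = UINT64_MAX };

	coerce(&b, 4);
	printf("smin=%" PRId64 " smax=%" PRId64 " umin=%" PRIu64 " umax=%" PRIu64 "\n",
	       b.smin, b.smax, b.umin, b.umax);
	/* prints: smin=0 smax=100 umin=0 umax=4294967295
	 * (the old rule would have widened smax to 4294967295)
	 */
	return 0;
}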