On 11/20/19 8:59 PM, Alexei Starovoitov wrote:
> On Wed, Nov 20, 2019 at 05:40:24PM -0800, Yonghong Song wrote:
>> With the latest llvm (trunk https://github.com/llvm/llvm-project),
>> test_progs, which has +alu32 enabled, failed for strobemeta.o.
>> The verifier output below has been edited to replace large decimal
>> numbers with hex ones.
>>   193: (85) call bpf_probe_read_user_str#114
>>     R0=inv(id=0)
>>   194: (26) if w0 > 0x1 goto pc+4
>>     R0_w=inv(id=0,umax_value=0xffffffff00000001)
>>   195: (6b) *(u16 *)(r7 +80) = r0
>>   196: (bc) w6 = w0
>>     R6_w=inv(id=0,umax_value=0xffffffff,var_off=(0x0; 0xffffffff))
>>   197: (67) r6 <<= 32
>>     R6_w=inv(id=0,smax_value=0x7fffffff00000000,umax_value=0xffffffff00000000,
>>              var_off=(0x0; 0xffffffff00000000))
>>   198: (77) r6 >>= 32
>>     R6=inv(id=0,umax_value=0xffffffff,var_off=(0x0; 0xffffffff))
>>   ...
>>   201: (79) r8 = *(u64 *)(r10 -416)
>>     R8_w=map_value(id=0,off=40,ks=4,vs=13872,imm=0)
>>   202: (0f) r8 += r6
>>     R8_w=map_value(id=0,off=40,ks=4,vs=13872,umax_value=0xffffffff,
>>                    var_off=(0x0; 0xffffffff))
>>   203: (07) r8 += 9696
>>     R8_w=map_value(id=0,off=9736,ks=4,vs=13872,umax_value=0xffffffff,
>>                    var_off=(0x0; 0xffffffff))
>>   ...
>>   255: (bf) r1 = r8
>>     R1_w=map_value(id=0,off=9736,ks=4,vs=13872,umax_value=0xffffffff,
>>                    var_off=(0x0; 0xffffffff))
>>   ...
>>   257: (85) call bpf_probe_read_user_str#114
>>   R1 unbounded memory access, make sure to bounds check any array access into a map
>>
>> The value range for register r6 at insn 198 should really be just 0/1.
>> The umax_value=0xffffffff caused the later verification failure.
>>
>> After jmp instructions, the verifier already uses the just-obtained
>> information to derive a better register range. The current mechanism
>> covers 64bit registers only. This patch implements the same tightening
>> for 32bit sub-registers after jmp32 instructions.
>> With the patch, we have the below ranges for the above code sequence:
>>   193: (85) call bpf_probe_read_user_str#114
>>     R0=inv(id=0)
>>   194: (26) if w0 > 0x1 goto pc+4
>>     R0_w=inv(id=0,smax_value=0x7fffffff00000001,umax_value=0xffffffff00000001,
>>              var_off=(0x0; 0xffffffff00000001))
>>   195: (6b) *(u16 *)(r7 +80) = r0
>>   196: (bc) w6 = w0
>>     R6_w=inv(id=0,umax_value=0xffffffff,var_off=(0x0; 0x1))
>>   197: (67) r6 <<= 32
>>     R6_w=inv(id=0,umax_value=0x100000000,var_off=(0x0; 0x100000000))
>>   198: (77) r6 >>= 32
>>     R6=inv(id=0,umax_value=1,var_off=(0x0; 0x1))
>>   ...
>>   201: (79) r8 = *(u64 *)(r10 -416)
>>     R8_w=map_value(id=0,off=40,ks=4,vs=13872,imm=0)
>>   202: (0f) r8 += r6
>>     R8_w=map_value(id=0,off=40,ks=4,vs=13872,umax_value=1,var_off=(0x0; 0x1))
>>   203: (07) r8 += 9696
>>     R8_w=map_value(id=0,off=9736,ks=4,vs=13872,umax_value=1,var_off=(0x0; 0x1))
>>   ...
>>   255: (bf) r1 = r8
>>     R1_w=map_value(id=0,off=9736,ks=4,vs=13872,umax_value=1,var_off=(0x0; 0x1))
>>   ...
>>   257: (85) call bpf_probe_read_user_str#114
>>   ...
>>
>> At insn 194, the register R0 has a better var_off.mask and smax_value.
>> In particular, the var_off.mask ensures that the later lshift and
>> rshift maintain a proper value range.
>>
>> Suggested-by: Alexei Starovoitov <ast@xxxxxxxxxx>
>> Signed-off-by: Yonghong Song <yhs@xxxxxx>
>> ---
>>  kernel/bpf/verifier.c | 32 ++++++++++++++++++++++++++++----
>>  1 file changed, 28 insertions(+), 4 deletions(-)
>>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 9f59f7a19dd0..0090654c9010 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -1007,6 +1007,20 @@ static void __reg_bound_offset(struct bpf_reg_state *reg)
>>  					      reg->umax_value));
>>  }
>>
>> +static void __reg_bound_offset32(struct bpf_reg_state *reg)
>> +{
>> +	u64 mask = 0xffffFFFF;
>> +	struct tnum range = tnum_range(reg->umin_value & mask,
>> +				       reg->umax_value & mask);
>> +	struct tnum lo32 = tnum_cast(reg->var_off, 4);
>> +	struct tnum hi32 = reg->var_off;
>> +
>> +	hi32.value &= ~mask;
>> +	hi32.mask &= ~mask;
>
> sorry, that was a quick hack :)
> Maybe it makes sense to do it as a helper, similar to tnum_cast?
> The idea was to apply tnum_range to the lower 32 bits.
> Maybe tnum_intersect_with_mask(a, b, 4)?
> Or the above two lines could be
>   hi32 = tnum_and(reg->var_off, tnum_bitwise_not(tnum_u32_max));
> There is tnum_lshift/tnum_rshift. They could be used as well.

I will use tnum_lshift/tnum_rshift, which seems easier to understand.
Will send v2 soon.

> Ed,
> how would you simplify the __reg_bound_offset32 logic?
>
>> +
>> +	reg->var_off = tnum_or(hi32, tnum_intersect(lo32, range));
>> +}
>> +
>>  /* Reset the min/max bounds of a register */
>>  static void __mark_reg_unbounded(struct bpf_reg_state *reg)
>>  {
>> @@ -5587,8 +5601,13 @@ static void reg_set_min_max(struct bpf_reg_state *true_reg,
>>  	__reg_deduce_bounds(false_reg);
>>  	__reg_deduce_bounds(true_reg);
>>  	/* We might have learned some bits from the bounds. */
>> -	__reg_bound_offset(false_reg);
>> -	__reg_bound_offset(true_reg);
>> +	if (is_jmp32) {
>> +		__reg_bound_offset32(false_reg);
>> +		__reg_bound_offset32(true_reg);
>> +	} else {
>> +		__reg_bound_offset(false_reg);
>> +		__reg_bound_offset(true_reg);
>> +	}
>>  	/* Intersecting with the old var_off might have improved our bounds
>>  	 * slightly.  e.g. if umax was 0x7f...f and var_off was (0; 0xf...fc),
>>  	 * then new var_off is (0; 0x7f...fc) which improves our umax.
>> @@ -5696,8 +5715,13 @@ static void reg_set_min_max_inv(struct bpf_reg_state *true_reg,
>>  	__reg_deduce_bounds(false_reg);
>>  	__reg_deduce_bounds(true_reg);
>>  	/* We might have learned some bits from the bounds. */
>> -	__reg_bound_offset(false_reg);
>> -	__reg_bound_offset(true_reg);
>> +	if (is_jmp32) {
>> +		__reg_bound_offset32(false_reg);
>> +		__reg_bound_offset32(true_reg);
>> +	} else {
>> +		__reg_bound_offset(false_reg);
>> +		__reg_bound_offset(true_reg);
>> +	}
>>  	/* Intersecting with the old var_off might have improved our bounds
>>  	 * slightly.  e.g. if umax was 0x7f...f and var_off was (0; 0xf...fc),
>>  	 * then new var_off is (0; 0x7f...fc) which improves our umax.
>> --
>> 2.17.1
>>
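
For reference, __reg_bound_offset32() rewritten with tnum_lshift/tnum_rshift
might look as below. This is only a sketch of the promised v2, not the posted
patch; the tnum helpers used (tnum_range, tnum_cast, tnum_lshift, tnum_rshift,
tnum_intersect, tnum_or) all exist in kernel/bpf/tnum.c with these signatures.

    static void __reg_bound_offset32(struct bpf_reg_state *reg)
    {
    	u64 mask = 0xffffFFFF;
    	struct tnum range = tnum_range(reg->umin_value & mask,
    				       reg->umax_value & mask);
    	/* lo32: keep only the known bits of the low 32 bits of var_off */
    	struct tnum lo32 = tnum_cast(reg->var_off, 4);
    	/* hi32: shift the low 32 bits out and back in to clear them,
    	 * replacing the open-coded hi32.value/hi32.mask &= ~mask above
    	 */
    	struct tnum hi32 = tnum_lshift(tnum_rshift(reg->var_off, 32), 32);

    	/* intersect the jmp32-derived range with the low 32 bits only,
    	 * then recombine with the untouched upper 32 bits
    	 */
    	reg->var_off = tnum_or(hi32, tnum_intersect(lo32, range));
    }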
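
To see concretely why the tightened var_off from insn 194 survives the
<<32/>>32 round trip at insns 197-198, here is a small standalone userspace
sketch. The struct and shift rules mirror the kernel's tnum semantics
(value holds the known set bits, mask holds the unknown bits); the program
itself is illustrative only and not part of the patch.

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    /* Mirrors the kernel's struct tnum: value = known set bits,
     * mask = bits whose value is unknown.
     */
    struct tnum {
    	uint64_t value;
    	uint64_t mask;
    };

    /* Same shift rules as tnum_lshift()/tnum_rshift() in kernel/bpf/tnum.c */
    static struct tnum tnum_lshift(struct tnum a, uint8_t shift)
    {
    	return (struct tnum){ a.value << shift, a.mask << shift };
    }

    static struct tnum tnum_rshift(struct tnum a, uint8_t shift)
    {
    	return (struct tnum){ a.value >> shift, a.mask >> shift };
    }

    int main(void)
    {
    	/* Without the patch: w6 = w0 only proves the value fits in 32 bits */
    	struct tnum loose = { 0x0, 0xffffffff };
    	/* With the patch: the jmp32 check proved w0 (hence w6) is 0 or 1 */
    	struct tnum tight = { 0x0, 0x1 };

    	/* The r6 <<= 32; r6 >>= 32 sequence from insns 197-198 */
    	loose = tnum_rshift(tnum_lshift(loose, 32), 32);
    	tight = tnum_rshift(tnum_lshift(tight, 32), 32);

    	/* loose keeps mask 0xffffffff -> umax 0xffffffff, an unbounded index;
    	 * tight keeps mask 0x1 -> umax 1, so r8 += r6 stays in bounds
    	 */
    	printf("loose var_off=(%#" PRIx64 "; %#" PRIx64 ")\n",
    	       loose.value, loose.mask);
    	printf("tight var_off=(%#" PRIx64 "; %#" PRIx64 ")\n",
    	       tight.value, tight.mask);
    	return 0;
    }

Compiled and run, it prints loose var_off=(0; 0xffffffff) versus
tight var_off=(0; 0x1), matching the R6 states at insn 198 in the two
verifier logs above.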