On Fri, 2023-05-26 at 17:40 -0700, Yonghong Song wrote: > > On 5/26/23 11:41 AM, Eduard Zingerman wrote: > > Make sure that the following unsafe example is rejected by verifier: > > > > 1: r9 = ... some pointer with range X ... > > 2: r6 = ... unbound scalar ID=a ... > > 3: r7 = ... unbound scalar ID=b ... > > 4: if (r6 > r7) goto +1 > > 5: r6 = r7 > > 6: if (r6 > X) goto ... > > --- checkpoint --- > > 7: r9 += r7 > > 8: *(u64 *)r9 = Y > > > > This example is unsafe because not all execution paths verify r7 range. > > Because of the jump at (4) the verifier would arrive at (6) in two states: > > I. r6{.id=b}, r7{.id=b} via path 1-6; > > II. r6{.id=a}, r7{.id=b} via path 1-4, 6. > > > > Currently regsafe() does not call check_ids() for scalar registers, > > thus from POV of regsafe() states (I) and (II) are identical. If the > > path 1-6 is taken by verifier first, and checkpoint is created at (6) > > the path [1-4, 6] would be considered safe. > > > > This commit updates regsafe() to call check_ids() for scalar registers. > > > > The change in check_alu_op() to avoid assigning scalar id to constants > > is performance optimization. W/o it the regsafe() change becomes > > costly for some programs, e.g. for > > tools/testing/selftests/bpf/progs/pyperf600.c the difference is: > > > > File Program States (A) States (B) States (DIFF) > > --------------- -------- ---------- ---------- ---------------- > > pyperf600.bpf.o on_event 22200 37060 +14860 (+66.94%) > > > > Where A -- this patch, > > B -- this patch but w/o check_alu_op() changes. > > > > Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.") > > Signed-off-by: Eduard Zingerman <eddyz87@xxxxxxxxx> > > --- > > kernel/bpf/verifier.c | 31 ++++++++++++++++++++++++++++++- > > 1 file changed, 30 insertions(+), 1 deletion(-) > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index af70dad655ab..624556eda430 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -12806,10 +12806,12 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) > > /* case: R1 = R2 > > * copy register state to dest reg > > */ > > - if (src_reg->type == SCALAR_VALUE && !src_reg->id) > > + if (src_reg->type == SCALAR_VALUE && !src_reg->id && > > + !tnum_is_const(src_reg->var_off)) > > /* Assign src and dst registers the same ID > > * that will be used by find_equal_scalars() > > * to propagate min/max range. > > + * Skip constants to avoid allocation of useless ID. > > */ > > The above is for ALU64. > > We also have ALU32 version: > } else if (src_reg->type == SCALAR_VALUE) { > bool is_src_reg_u32 = src_reg->umax_value <= U32_MAX; > > if (is_src_reg_u32 && !src_reg->id) > src_reg->id = ++env->id_gen; > copy_register_state(dst_reg, src_reg); > ... > > Do you think we should do the same thing if src_reg is a constant, > not to change src_reg->id? This is a good point, thank you. Adding the same check for 32-bit case actually helps with the verifier performance a bit: $ ./veristat -e file,prog,states -f 'insns_pct>1' -C master-baseline.log current.log File Program States (A) States (B) States (DIFF) --------- ------------------------------ ---------- ---------- ------------- bpf_xdp.o tail_handle_nat_fwd_ipv6 648 660 +12 (+1.85%) bpf_xdp.o tail_nodeport_nat_ingress_ipv4 375 455 +80 (+21.33%) bpf_xdp.o tail_rev_nodeport_lb4 398 472 +74 (+18.59%) (all +1% - +3% cases from the cover letter are gone). > If this is added, could you have a test case for 32-bit subregister > as well? I will add the 32-bit test case. > > > src_reg->id = ++env->id_gen; > > copy_register_state(dst_reg, src_reg); > > @@ -15151,6 +15153,33 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold, > > > > switch (base_type(rold->type)) { > > case SCALAR_VALUE: > > + /* Why check_ids() for precise registers? > > + * > > + * Consider the following BPF code: > > + * 1: r6 = ... unbound scalar, ID=a ... > > + * 2: r7 = ... unbound scalar, ID=b ... > > + * 3: if (r6 > r7) goto +1 > > + * 4: r6 = r7 > > + * 5: if (r6 > X) goto ... > > + * 6: ... memory operation using r7 ... > > + * > > + * First verification path is [1-6]: > > + * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7; > > + * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark > > + * r7 <= X, because r6 and r7 share same id. > > + * > > + * Next verification path would start from (5), because of the jump at (3). > > + * The only state difference between first and second visits of (5) is > > + * bpf_reg_state::id assignments for r6 and r7: (b, b) vs (a, b). > > + * Thus, use check_ids() to distinguish these states. > > + * > > + * The `rold->precise` check is a performance optimization. If `rold->id` > > + * was ever used to access memory / predict jump, the `rold` or any > > + * register used in `rold = r?` / `r? = rold` operations would be marked > > + * as precise, otherwise it's ID is not really interesting. > > + */ > > + if (rold->precise && rold->id && !check_ids(rold->id, rcur->id, idmap)) > > Do we need rold->id checking in the above? check_ids should have > rold->id = 0 properly. Or this is just an optimization? You are correct, the check_ids() handles this case and it should be inlined, so there is no need to check rold->id in this 'if' branch. > regs_exact() has check_ids as well. Not sure whether it makes sense to > create a function regs_exact_scalar() just for scalar and include the > above code. Otherwise, it is strange we do check_ids in different > places. I'm not sure how to best re-organize code here, regs_exact() is a nice compartmentalized abstraction. It is possible to merge my additional check_ids() call with the main 'precise' processing part as below: @@ -15152,21 +15154,22 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold, switch (base_type(rold->type)) { case SCALAR_VALUE: if (regs_exact(rold, rcur, idmap)) return true; if (env->explore_alu_limits) return false; if (!rold->precise) return true; /* new val must satisfy old val knowledge */ return range_within(rold, rcur) && - tnum_in(rold->var_off, rcur->var_off); + tnum_in(rold->var_off, rcur->var_off) && + check_ids(rold->id, rcur->id, idmap); I'd say that extending /* new val must satisfy ... */ comment to explain why check_ids() is needed should be sufficient, but I'm open for suggestions. > > > + return false; > > if (regs_exact(rold, rcur, idmap)) > > return true; > > if (env->explore_alu_limits)