On Sat, 2023-05-27 at 15:21 +0300, Eduard Zingerman wrote: [...] > > > @@ -15151,6 +15153,33 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold, > > > > > > switch (base_type(rold->type)) { > > > case SCALAR_VALUE: > > > + /* Why check_ids() for precise registers? > > > + * > > > + * Consider the following BPF code: > > > + * 1: r6 = ... unbound scalar, ID=a ... > > > + * 2: r7 = ... unbound scalar, ID=b ... > > > + * 3: if (r6 > r7) goto +1 > > > + * 4: r6 = r7 > > > + * 5: if (r6 > X) goto ... > > > + * 6: ... memory operation using r7 ... > > > + * > > > + * First verification path is [1-6]: > > > + * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7; > > > + * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark > > > + * r7 <= X, because r6 and r7 share same id. > > > + * > > > + * Next verification path would start from (5), because of the jump at (3). > > > + * The only state difference between first and second visits of (5) is > > > + * bpf_reg_state::id assignments for r6 and r7: (b, b) vs (a, b). > > > + * Thus, use check_ids() to distinguish these states. > > > + * > > > + * The `rold->precise` check is a performance optimization. If `rold->id` > > > + * was ever used to access memory / predict jump, the `rold` or any > > > + * register used in `rold = r?` / `r? = rold` operations would be marked > > > + * as precise, otherwise it's ID is not really interesting. > > > + */ > > > + if (rold->precise && rold->id && !check_ids(rold->id, rcur->id, idmap)) > > > > Do we need rold->id checking in the above? check_ids should have > > rold->id = 0 properly. Or this is just an optimization? > > You are correct, the check_ids() handles this case and it should be inlined, > so there is no need to check rold->id in this 'if' branch. > > > regs_exact() has check_ids as well. Not sure whether it makes sense to > > create a function regs_exact_scalar() just for scalar and include the > > above code. Otherwise, it is strange we do check_ids in different > > places. > > I'm not sure how to best re-organize code here, regs_exact() is a nice > compartmentalized abstraction. It is possible to merge my additional > check_ids() call with the main 'precise' processing part as below: > > @@ -15152,21 +15154,22 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold, > switch (base_type(rold->type)) { > case SCALAR_VALUE: > if (regs_exact(rold, rcur, idmap)) > return true; > if (env->explore_alu_limits) > return false; > if (!rold->precise) > return true; > /* new val must satisfy old val knowledge */ > return range_within(rold, rcur) && > - tnum_in(rold->var_off, rcur->var_off); > + tnum_in(rold->var_off, rcur->var_off) && > + check_ids(rold->id, rcur->id, idmap); > > I'd say that extending /* new val must satisfy ... */ comment to > explain why check_ids() is needed should be sufficient, but I'm open > for suggestions. On the other hand, I wanted to have a separate 'if' branch like: if (rold->precise && !check_ids(rold->id, rcur->id, idmap)) Specifically to explain that 'rold->precise' part is an optimization. > > > > > > + return false; > > > if (regs_exact(rold, rcur, idmap)) > > > return true; > > > if (env->explore_alu_limits) >