On Tue, 2023-05-30 at 14:37 -0700, Andrii Nakryiko wrote:
> On Tue, May 30, 2023 at 10:27 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> >
> > Make sure that the following unsafe example is rejected by the verifier:
> >
> > 1: r9 = ... some pointer with range X ...
> > 2: r6 = ... unbound scalar ID=a ...
> > 3: r7 = ... unbound scalar ID=b ...
> > 4: if (r6 > r7) goto +1
> > 5: r6 = r7
> > 6: if (r6 > X) goto ...
> > --- checkpoint ---
> > 7: r9 += r7
> > 8: *(u64 *)r9 = Y
> >
> > This example is unsafe because not all execution paths verify r7's range.
> > Because of the jump at (4) the verifier arrives at (6) in two states:
> > I.  r6{.id=b}, r7{.id=b} via path 1-6;
> > II. r6{.id=a}, r7{.id=b} via path 1-4, 6.
> >
> > Currently regsafe() does not call check_ids() for scalar registers,
> > thus from the POV of regsafe() states (I) and (II) are identical.
> > If path 1-6 is taken by the verifier first and a checkpoint is created
> > at (6), the path [1-4, 6] would be considered safe.
> >
> > This commit updates regsafe() to call check_ids() for scalar registers.
> >
> > This change is costly in terms of verification performance.
> > Using veristat to compare the number of processed states for selftest
> > object files listed in tools/testing/selftests/bpf/veristat.cfg and
> > Cilium object files from [1] gives the following statistics:
> >
> >   Filter        | Number of programs
> >   ----------------------------------
> >   states_pct>10 | 40
> >   states_pct>20 | 20
> >   states_pct>30 | 15
> >   states_pct>40 | 11
> >
> > (Out of 177 programs in total.)
> >
> > In fact, the impact is so bad that in no-alu32 mode the following
> > test_progs tests no longer pass verification:
> > - verif_scale2: maximal number of instructions exceeded
> > - verif_scale3: maximal number of instructions exceeded
> > - verif_scale_pyperf600: maximal number of instructions exceeded
> >
> > Additionally:
> > - verifier_search_pruning/allocated_stack: expected pruning does not
> >   happen because of differences in register id allocation.
> >
> > A follow-up patch will address these issues.
> >
> > [1] git@xxxxxxxxxx:anakryiko/cilium.git
> >
> > Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
> > Signed-off-by: Eduard Zingerman <eddyz87@xxxxxxxxx>
> > ---
> >  kernel/bpf/verifier.c | 22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index af70dad655ab..9c10f2619c4f 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -15151,6 +15151,28 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
> >
> >  	switch (base_type(rold->type)) {
> >  	case SCALAR_VALUE:
> > +		/* Why check_ids() for scalar registers?
> > +		 *
> > +		 * Consider the following BPF code:
> > +		 * 1: r6 = ... unbound scalar, ID=a ...
> > +		 * 2: r7 = ... unbound scalar, ID=b ...
> > +		 * 3: if (r6 > r7) goto +1
> > +		 * 4: r6 = r7
> > +		 * 5: if (r6 > X) goto ...
> > +		 * 6: ... memory operation using r7 ...
> > +		 *
> > +		 * First verification path is [1-6]:
> > +		 * - at (4) the same bpf_reg_state::id (b) would be assigned to r6 and r7;
> > +		 * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark
> > +		 *   r7 <= X, because r6 and r7 share the same id.
> > +		 *
> > +		 * Next verification path would start from (5), because of the jump at (3).
> > +		 * The only state difference between the first and second visits of (5) is
> > +		 * bpf_reg_state::id assignments for r6 and r7: (b, b) vs (a, b).
> > +		 * Thus, use check_ids() to distinguish these states.
> > +		 */
> > +		if (!check_ids(rold->id, rcur->id, idmap))
> > +			return false;

> does this check_ids() have to be performed before regs_exact (which
> also checks IDs, btw) *and* before rold->precise check?

Relative position to regs_exact() does not matter (because it does
check_ids). Relative position to rold->precise *does* matter (see the
next answer).

> Intuitively, it feels like `rold->precise = false` case shouldn't care
> about IDs in rcur, as any value should be safe. But I haven't spent
> much time thinking about this, so there might be some corner case I'm
> missing.

I tried to explain this in the cover letter; I'll add more details below.

Effectively, what you suggest is to modify the check as follows:

	if (rold->precise && !check_ids(rold->id, rcur->id, idmap))
		return false;

Unfortunately, not all registers with interesting IDs would be marked
as precise. Consider the original example, but with r6 and r7 swapped:

1: r9 = ... some pointer with range X ...
2: r6 = ... unbound scalar ID=a ...
3: r7 = ... unbound scalar ID=b ...
4: if (r6 > r7) goto +1
5: r7 = r6
6: if (r7 > X) goto ...
7: r9 += r6

Suppose that the current verification path is 1-7:
- On the way down 1-6, r7 will not be marked as precise, because the
  condition (r7 > X) is not predictable (see check_cond_jmp_op());
- When (7) is reached, mark_chain_precision() will start moving up,
  marking the following registers as precise:

  4: if (r6 > r7) goto +1	; r6, r7
  5: r7 = r6			; r6
  6: if (r7 > X) goto ...	; r6
  7: r9 += r6			; r6

- Thus, if a checkpoint is created for (6), r7 would be marked as read,
  but will not be marked as precise.

Next, suppose that the jump from 4 to 6 is verified and the checkpoint
for (6) is considered:
- r6 is not precise, so check_ids() is not called for it and it is not
  added to the idmap;
- r7 is precise, so check_ids() is called for it, but it is the sole
  register in the idmap;
- The states are considered equal.

Here is the log (I added a few prints for states cache comparison):

  from 10 to 13: safe
  steq hit 10, cur: R0=scalar(id=2) R6=scalar(id=2) R7=scalar(id=1) R9=fp-8 R10=fp0 fp-8=00000000
  steq hit 10, old: R6_rD=Pscalar(id=2) R7_rwD=scalar(id=2) R9_rD=fp-8 R10=fp0 fp-8_rD=00000000

verifier_scalar_ids.c:ids_id_mapping_in_regsafe_2() is another example
(the test names are so-so...).

To quote myself from the cover letter: ideally, check_ids() should only
be called for an 'rold' that either:

(a) gained range via find_equal_scalars() in some child verification
    path and was then marked as precise;
(b) was a source of range information for some other register via
    find_equal_scalars() in some child verification path, and that
    register was then marked as precise.

The current implementation of mark_chain_precision() does not guarantee
a precision mark for point (b). This leads to a few questions:

- Q: Should (b) be guaranteed?
  A: I don't know. If patch #1 from this patch-set is applied, I cannot
     come up with a counter-example showing that (b) is necessary.
     Corollary: (b) is a performance optimization for patch #1.

- Q: How hard is it to implement (b)?
  A: Information about register id assignments for instructions in the
     middle of a state is lost. I see a few ways to mitigate this:
     - Extend bpf_verifier_state::jmp_history to track a mask of
       registers / stack slots that gained range via
       find_equal_scalars() and use this mask in backtrack_insn().
     - Make it so that every conditional jump instruction is the last
       instruction in a state. Then register ID assignments would
       actually be valid, and backtrack_insn() could search for
       registers with the same ID when marking precise registers.

- Q: If (b) is merely an optimization, which is simpler: (b) or patch #3
     (the u32_hashset thing)?
  A: I think that patch #3 is logically simpler.

So, it boils down to the question: should (b) be guaranteed?
What do you think?

> I wonder if moving this check after we handled imprecise rold case would help.
>
> Also, it might make sense to drop SCALAR register IDs as soon as we
> have only one instance of it left (e.g., if "paired" register was
> overwritten already). I.e., aggressively drop IDs when they become
> useless. WDYT?

I'll try, but I don't expect a big change; I will report a bit later.

> The u32_hashset.... Have you also measured verification time
> regression with this? I wonder how much impact that change has?

Summing up the duration_diff field for all object files, the average is
a 0.5% verification time increase.

> > 	if (regs_exact(rold, rcur, idmap))
> > 		return true;
> > 	if (env->explore_alu_limits)
> > --
> > 2.40.1
> >