On Sat, 2023-10-21 at 21:16 -0700, Andrii Nakryiko wrote:
> On Sat, Oct 21, 2023 at 6:08 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> >
> > Convergence for open coded iterators is computed in is_state_visited()
> > by examining states with branches count > 1 and using states_equal().
> > states_equal() computes the sub-state relation using read and precision
> > marks. Read and precision marks are propagated from children states,
> > thus they are not guaranteed to be complete inside a loop when branches
> > count > 1. This can be demonstrated using the following unsafe program:
> >
> >      1. r7 = -16
> >      2. r6 = bpf_get_prandom_u32()
> >      3. while (bpf_iter_num_next(&fp[-8])) {
> >      4.   if (r6 != 42) {
> >      5.     r7 = -32
> >      6.     r6 = bpf_get_prandom_u32()
> >      7.     continue
> >      8.   }
> >      9.   r0 = r10
> >     10.   r0 += r7
> >     11.   r8 = *(u64 *)(r0 + 0)
> >     12.   r6 = bpf_get_prandom_u32()
> >     13. }
> >
> > Here the verifier would first visit path 1-3, create a checkpoint at 3
> > with r7=-16, and continue through 4-7 back to 3 with r7=-32.
> >
> > Because instructions at 9-12 had not been visited yet, the existing
> > checkpoint at 3 does not have a read or precision mark for r7.
> > Thus states_equal() would return true, the verifier would discard the
> > current state, and the unsafe memory access at 11 would not be caught.
> >
> > This commit fixes this loophole by introducing exact state comparisons
> > for the iterator convergence logic:
> > - registers are compared using regs_exact() regardless of read or
> >   precision marks;
> > - stack slots have to have an identical type.
> >
> > Unfortunately, this is too strict even for simple programs like below:
> >
> >     i = 0;
> >     while (iter_next(&it))
> >       i++;
> >
> > At each iteration step i++ would produce a new distinct state and
> > eventually the instruction processing limit would be reached.
> >
> > To avoid such behavior, speculatively forget (widen) the range of
> > imprecise scalar registers if those registers were not precise at the
> > end of the previous iteration and do not match exactly.
> >
> > This is a conservative heuristic that allows verification of a wide
> > range of programs, but it precludes verification of programs that
> > conjure an imprecise value on the first loop iteration and use it as
> > precise on the second.
> >
> > Test case iter_task_vma_for_each() presents one such case:
> >
> >     unsigned int seen = 0;
> >     ...
> >     bpf_for_each(task_vma, vma, task, 0) {
> >       if (seen >= 1000)
> >         break;
> >       ...
> >       seen++;
> >     }
> >
> > Here clang generates the following code:
> >
> >     <LBB0_4>:
> >       24: r8 = r6                            ; stash current value of
> >           ... body ...                         'seen'
> >       29: r1 = r10
> >       30: r1 += -0x8
> >       31: call bpf_iter_task_vma_next
> >       32: r6 += 0x1                          ; seen++;
> >       33: if r0 == 0x0 goto +0x2 <LBB0_6>    ; exit on next() == NULL
> >       34: r7 += 0x10
> >       35: if r8 < 0x3e7 goto -0xc <LBB0_4>   ; loop on seen < 1000
> >
> >     <LBB0_6>:
> >       ... exit ...
> >
> > Note that the counter in r6 is copied to r8 and then incremented, and
> > the conditional jump is done using r8. Because of this the precision
> > mark for r6 lags one state behind the precision mark on r8 and the
> > widening logic kicks in.
> >
> > Adding barrier_var(seen) after the conditional is sufficient to force
> > clang to use the same register for both counting and the conditional
> > jump.
> >
> > This issue was discussed in the thread [1], which was started by
> > Andrew Werner <awerner32@xxxxxxxxx> demonstrating a similar bug in
> > callback functions handling. The callbacks would be addressed in a
> > followup patch.
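The widening step described above is implemented by maybe_widen_reg(),
which the hunk quoted further down calls but whose body is elided. A
minimal sketch of what such a helper could look like, assuming the
checks follow the commit message description (the actual patch may
differ):

    /* Widen only old/cur pairs of imprecise scalars that do not match
     * exactly: forgetting the tracked range lets the loop state
     * converge, while precise registers are left intact so that the
     * exact comparison can still fail and force another iteration.
     */
    static int maybe_widen_reg(struct bpf_verifier_env *env,
                               struct bpf_reg_state *rold,
                               struct bpf_reg_state *rcur,
                               struct bpf_idmap *idmap)
    {
            if (rold->type != SCALAR_VALUE || rcur->type != SCALAR_VALUE)
                    return 0;
            if (rold->precise || rcur->precise)
                    return 0;
            if (regs_exact(rold, rcur, idmap))
                    return 0;
            /* forget the range: mark the current register unknown */
            __mark_reg_unknown(env, rcur);
            return 0;
    }

This is exactly why the iter_task_vma_for_each() case above trips up:
r6 is still imprecise at the end of the first iteration, so it gets
widened even though the program later relies on its bound.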
> >
> > [1] https://lore.kernel.org/bpf/97a90da09404c65c8e810cf83c94ac703705dc0e.camel@xxxxxxxxx/
> >
> > Co-developed-by: Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx>
> > Co-developed-by: Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx>
> > Signed-off-by: Eduard Zingerman <eddyz87@xxxxxxxxx>
> > ---
> >  include/linux/bpf_verifier.h                  |   1 +
> >  kernel/bpf/verifier.c                         | 212 +++++++++++++++---
> >  .../selftests/bpf/progs/iters_task_vma.c      |   1 +
> >  3 files changed, 184 insertions(+), 30 deletions(-)
> >
>
> [...]
>
> > +static int widen_imprecise_scalars(struct bpf_verifier_env *env,
> > +                                  struct bpf_verifier_state *old,
> > +                                  struct bpf_verifier_state *cur)
> > +{
> > +       struct bpf_func_state *fold, *fcur;
> > +       int i, fr;
> > +
> > +       reset_idmap_scratch(env);
> > +       for (fr = old->curframe; fr >= 0; fr--) {
> > +               fold = old->frame[fr];
> > +               fcur = cur->frame[fr];
> > +
> > +               for (i = 0; i < MAX_BPF_REG; i++)
> > +                       maybe_widen_reg(env,
> > +                                       &fold->regs[i],
> > +                                       &fcur->regs[i],
> > +                                       &env->idmap_scratch);
> > +
> > +               for (i = 0; i < fold->allocated_stack / BPF_REG_SIZE; i++) {
> > +                       if (is_spilled_scalar_reg(&fold->stack[i]))
>
> should this be !is_spilled_scalar_reg()?
>
> but also you should make sure that fcur->stack[i] is also a spilled
> reg (it could be iter or dynptr, or just hard-coded ZERO or MISC stack
> slot, so accessing fcur->stack[i].spilled_ptr below might be invalid)

Yeap, this is a bug, thank you.
I'll add a test case and submit V3 today.

>
> > +                               continue;
> > +
> > +                       maybe_widen_reg(env,
> > +                                       &fold->stack[i].spilled_ptr,
> > +                                       &fcur->stack[i].spilled_ptr,
> > +                                       &env->idmap_scratch);
> > +               }
> > +       }
> > +       return 0;
> > +}
> > +
>
> [...]
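Given the ack above, the corrected stack-slot loop would presumably
check both states before touching spilled_ptr. A sketch of that shape
(an assumption about what V3 might look like, not the actual V3 patch):

    /* Skip the slot unless *both* the old and the current slot hold a
     * spilled scalar; otherwise fcur->stack[i].spilled_ptr could be
     * read out of a slot that holds an iter, dynptr, or ZERO/MISC data
     * in the current state.
     */
    for (i = 0; i < fold->allocated_stack / BPF_REG_SIZE; i++) {
            if (!is_spilled_scalar_reg(&fold->stack[i]) ||
                !is_spilled_scalar_reg(&fcur->stack[i]))
                    continue;

            maybe_widen_reg(env,
                            &fold->stack[i].spilled_ptr,
                            &fcur->stack[i].spilled_ptr,
                            &env->idmap_scratch);
    }

Note the inverted condition compared to the quoted hunk: as written,
the original skipped exactly the spilled scalars it was supposed to
widen.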