On Sat, 2023-10-21 at 21:16 -0700, Andrii Nakryiko wrote:
> On Sat, Oct 21, 2023 at 6:08 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> >
> > Convergence for open coded iterators is computed in is_state_visited()
> > by examining states with branches count > 1 and using states_equal().
> > states_equal() computes the sub-state relation using read and precision
> > marks. Read and precision marks are propagated from children states,
> > thus they are not guaranteed to be complete inside a loop when branches
> > count > 1. This can be demonstrated using the following unsafe program:
> >
> >      1. r7 = -16
> >      2. r6 = bpf_get_prandom_u32()
> >      3. while (bpf_iter_num_next(&fp[-8])) {
> >      4.   if (r6 != 42) {
> >      5.     r7 = -32
> >      6.     r6 = bpf_get_prandom_u32()
> >      7.     continue
> >      8.   }
> >      9.   r0 = r10
> >     10.   r0 += r7
> >     11.   r8 = *(u64 *)(r0 + 0)
> >     12.   r6 = bpf_get_prandom_u32()
> >     13. }
> >
> > Here the verifier would first visit path 1-3, create a checkpoint at 3
> > with r7=-16, and continue through 4-7 back to 3 with r7=-32.
> >
> > Because instructions at 9-12 had not been visited yet, the existing
> > checkpoint at 3 does not have a read or precision mark for r7.
> > Thus states_equal() would return true, the verifier would discard the
> > current state, and the unsafe memory access at 11 would not be caught.
> >
> > This commit fixes this loophole by introducing exact state comparisons
> > for the iterator convergence logic:
> > - registers are compared using regs_exact() regardless of read or
> >   precision marks;
> > - stack slots have to have an identical type.
> >
> > Unfortunately, this is too strict even for simple programs like below:
> >
> >     i = 0;
> >     while (iter_next(&it))
> >       i++;
> >
> > At each iteration step i++ would produce a new distinct state and
> > eventually the instruction processing limit would be reached.
> >
> > To avoid such behavior, speculatively forget (widen) the range of
> > imprecise scalar registers if those registers were not precise at the
> > end of the previous iteration and do not match exactly.
> >
> > This is a conservative heuristic that allows verification of a wide
> > range of programs, but it precludes verification of programs that
> > conjure an imprecise value on the first loop iteration and use it as
> > precise on the second.
> >
> > Test case iter_task_vma_for_each() presents one such case:
> >
> >     unsigned int seen = 0;
> >     ...
> >     bpf_for_each(task_vma, vma, task, 0) {
> >       if (seen >= 1000)
> >         break;
> >       ...
> >       seen++;
> >     }
> >
> > Here clang generates the following code:
> >
> >     <LBB0_4>:
> >       24: r8 = r6                            ; stash current value of
> >           ... body ...                         'seen'
> >       29: r1 = r10
> >       30: r1 += -0x8
> >       31: call bpf_iter_task_vma_next
> >       32: r6 += 0x1                          ; seen++;
> >       33: if r0 == 0x0 goto +0x2 <LBB0_6>    ; exit on next() == NULL
> >       34: r7 += 0x10
> >       35: if r8 < 0x3e7 goto -0xc <LBB0_4>   ; loop on seen < 1000
> >
> >     <LBB0_6>:
> >       ... exit ...
> >
> > Note that the counter in r6 is copied to r8 and then incremented, and
> > the conditional jump is done using r8. Because of this the precision
> > mark for r6 lags one state behind the precision mark on r8 and the
> > widening logic kicks in.
> >
> > Adding barrier_var(seen) after the conditional is sufficient to force
> > clang to use the same register for both counting and the conditional
> > jump.
> >
> > This issue was discussed in the thread [1], which was started by
> > Andrew Werner <awerner32@xxxxxxxxx> demonstrating a similar bug in
> > callback functions handling. The callbacks would be addressed in a
> > followup patch.
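The widening step described above is implemented by maybe_widen_reg(),
which the hunk quoted further down calls but whose body is elided. A
minimal sketch of what such a helper could look like, assuming the
checks follow the commit message description (the actual patch may
differ):

    /* Widen only old/cur pairs of imprecise scalars that do not match
     * exactly: forgetting the tracked range lets the loop state
     * converge, while precise registers are left intact so that the
     * exact comparison can still fail and force another iteration.
     */
    static int maybe_widen_reg(struct bpf_verifier_env *env,
                               struct bpf_reg_state *rold,
                               struct bpf_reg_state *rcur,
                               struct bpf_idmap *idmap)
    {
            if (rold->type != SCALAR_VALUE || rcur->type != SCALAR_VALUE)
                    return 0;
            if (rold->precise || rcur->precise)
                    return 0;
            if (regs_exact(rold, rcur, idmap))
                    return 0;
            /* forget the range: mark the current register unknown */
            __mark_reg_unknown(env, rcur);
            return 0;
    }

This is exactly why the iter_task_vma_for_each() case above trips up:
r6 is still imprecise at the end of the first iteration, so it gets
widened even though the program later relies on its bound.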
> >
> > [1] https://lore.kernel.org/bpf/97a90da09404c65c8e810cf83c94ac703705dc0e.camel@xxxxxxxxx/
> >
> > Co-developed-by: Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx>
> > Co-developed-by: Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx>
> > Signed-off-by: Eduard Zingerman <eddyz87@xxxxxxxxx>
> > ---
> >  include/linux/bpf_verifier.h                  |   1 +
> >  kernel/bpf/verifier.c                         | 212 +++++++++++++++---
> >  .../selftests/bpf/progs/iters_task_vma.c      |   1 +
> >  3 files changed, 184 insertions(+), 30 deletions(-)
> >
>
> [...]
>
> > +static int widen_imprecise_scalars(struct bpf_verifier_env *env,
> > +                                  struct bpf_verifier_state *old,
> > +                                  struct bpf_verifier_state *cur)
> > +{
> > +       struct bpf_func_state *fold, *fcur;
> > +       int i, fr;
> > +
> > +       reset_idmap_scratch(env);
> > +       for (fr = old->curframe; fr >= 0; fr--) {
> > +               fold = old->frame[fr];
> > +               fcur = cur->frame[fr];
> > +
> > +               for (i = 0; i < MAX_BPF_REG; i++)
> > +                       maybe_widen_reg(env,
> > +                                       &fold->regs[i],
> > +                                       &fcur->regs[i],
> > +                                       &env->idmap_scratch);
> > +
> > +               for (i = 0; i < fold->allocated_stack / BPF_REG_SIZE; i++) {
> > +                       if (is_spilled_scalar_reg(&fold->stack[i]))
>
> should this be !is_spilled_scalar_reg()?
>
> but also you should make sure that fcur->stack[i] is also a spilled
> reg (it could be iter or dynptr, or just hard-coded ZERO or MISC stack
> slot, so accessing fcur->stack[i].spilled_ptr below might be invalid)

Yeap, this is a bug, thank you.
I'll add a test case and submit V3 today.

>
> > +                               continue;
> > +
> > +                       maybe_widen_reg(env,
> > +                                       &fold->stack[i].spilled_ptr,
> > +                                       &fcur->stack[i].spilled_ptr,
> > +                                       &env->idmap_scratch);
> > +               }
> > +       }
> > +       return 0;
> > +}
> > +
>
> [...]
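Given the ack above, the corrected stack-slot loop would presumably
check both states before touching spilled_ptr. A sketch of that shape
(an assumption about what V3 might look like, not the actual V3 patch):

    /* Skip the slot unless *both* the old and the current slot hold a
     * spilled scalar; otherwise fcur->stack[i].spilled_ptr could be
     * read out of a slot that holds an iter, dynptr, or ZERO/MISC data
     * in the current state.
     */
    for (i = 0; i < fold->allocated_stack / BPF_REG_SIZE; i++) {
            if (!is_spilled_scalar_reg(&fold->stack[i]) ||
                !is_spilled_scalar_reg(&fcur->stack[i]))
                    continue;

            maybe_widen_reg(env,
                            &fold->stack[i].spilled_ptr,
                            &fcur->stack[i].spilled_ptr,
                            &env->idmap_scratch);
    }

Note the inverted condition compared to the quoted hunk: as written,
the original skipped exactly the spilled scalars it was supposed to
widen.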