Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)

Eduard Zingerman <eddyz87@xxxxxxxxx> · Tue, 26 Sep 2023 18:55:50 +0300

On Mon, 2023-09-25 at 17:33 -0700, Andrii Nakryiko wrote:
[...]
> not working for this intermixed iterator case would be ok, I think
> most practical use cases would be a properly nested loops.
> 
> But, compiler can actually do something like even for a proper loop.
> E.g., something like below
> 
> for (; bpf_iter_num_next(&it); ) {
>    ....
> }
> 
> in assembly could be layed out as
> 
> 
> bpf_iter_num_next(&it);
> ...
> again:
> r0 = bpf_iter_num_next(&it)
> ...
> if (r0) goto again
> 
> 
> Or something along those lines. So these assumptions that the
> iterator's next() call is happening at the same instruction is
> problematic.

For the specific example above I think it would not be an issue
because there is still only one next() call in the loop =>
there would be no stalled next() transition states.

I checked disassembly for progs/iters.c and it appears that for each
program there compiler preserves nested loops structure such that:
- each loop is driven by a single next() call of a dedicated iterator;
- each nested loop exit goes through destroy() for corresponding
  iterator, meaning that outer loop's next will never see inner's loop
  iterator as active => no stalled states due to inner loop
  processing.

Of-course, I'm not a robot and might have missed something that would
break real implementation.

> > Option B. "Widening"
> > --------------------
> > 
> > The trivial fix for current .branches > 0 states comparison is to
> > force "exact" states comparisons for is_iter_next_insn() case:
> > 1. Ignore liveness marks, as those are not finalized yet;
> > 2. Ignore precision marks, as those are not finalized yet;
> > 3. Use regs_exact() for scalars comparison.
> > 
> > This, however, is very restrictive as it fails to verify trivial
> > programs like (iters_looping/simplest_loop):
> > 
> >     sum = 0;
> >     bpf_iter_num_new(&it, 0, 10);
> >     while (!(r0 = bpf_iter_num_next()))
> >       sum += *(u32 *)r0;
> >     bpf_iter_num_destroy(&it);
> > 
> > Here ever increasing bounds for "sum" prevent regs_exact() from ever
> > returning true.
> > 
> > One way to lift this limitation is to borrow something like the
> > "widening" trick from the abstract interpretation papers mentioned
> > earlier:
> > - suppose there is current state C and a visited is_iter_next_insn()
> >   state V with .branches > 0;
> > - suppose states_equal(V, C) returns true but exact states comparison
> >   returns false because there are scalar values not marked as precise
> >   that differ between V and C.
> > - copy C as C' and mark these scalars as __mark_reg_unbounded() in C';
> > - run verification for C':
> >   - if verification succeeds -- problem solved;
> >   - if verification fails -- discard C' and proceed from C.
> > 
> > Such algorithm should succeed for programs that don't use widened
> > values in "precise" context.
> > 
> > Looking at testcases failing with trivial fix I think that such
> > limitations would be similar to those already present (e.g. the big
> > comment in progs/iters.c:iter_obfuscate_counter would still apply).
> > 
> 
> This makes sense. I was originally thinking along those lines (and
> rejected it for myself), but in a more eager (and thus restrictive)
> way: for any scalar register where old and new register states are
> equivalent thanks to read mark or precision bit (or rather lack of
> precision bit), force that precision/mark in old state. But that is
> too pessimistic for cases where we truly needed to simulate N+1 due to
> state differences, while keeping the old state intact.

In other words there is a function states_equal' for comparison of
states when old{.branches > 0}, which differs from states_equal in
the following way:
- considers all registers read;
- considers all scalars precise.

> What you propose with temporary C -> C' seems to be along similar
> lines, but will give "a second chance" if something doesn't work out:
> we'll go on N+1 instead of just failing.

Right.

> But let's think about this case. Let's say in V R1 is set to 7, but
> imprecise. In C we have R1 as 14, also imprecise. According to your
> algorithm, we'll create C1 with R1 set to <any num>. Now I have two
> questions:
> 
> 1. How do we terminate C' validation? What happens when we get back to
> the next() call again and do states_equal() between V and C'? We just
> assume it's correct because C' has R1 as unbounded?

For the definition above states_equal'(V, C') is false, but because we
are at checkpoint on the next iteration we would get to
states_equal'(C', C'') where C'' is derived from C' and same rules
would apply. If we are lucky nothing would change and there would no
point in scheduling another traversal.

> 2. Assuming yes. What if later on, V's R1 is actually forced to be
> precise, so only if V's R1=P7, then it is safe. But we already
> concluded that C' is safe based on R1 being unbounded, right? Isn't
> that a violation?

My base assumption is that:
1. any instruction reachable from V should also be reachable from C';
2. and that it is also guaranteed to be visited from C'.
I cannot give a formal reasoning for (2) at the moment.

In case if this assumption holds, when verification proceeds from C'
it would eventually get to instruction for which R1 value is only safe
when R1=7, if so:
- verification of C' conjecture would fail;
- all states derived from C' would need to be removed from
  states stack / explored states;
- verification should proceed from C.
(And there is also a question of delaying read/precision propagation
 from unproven conjecture states to regular states).

(Also I'm still not sure whether to use regs_exact() for scalars
 comparison instead of precision logic in states_equal').