Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)

Eduard Zingerman <eddyz87@xxxxxxxxx> · Mon, 25 Sep 2023 04:01:58 +0300

On Fri, 2023-09-22 at 13:52 -0700, Andrii Nakryiko wrote:
> > > I can't say I understand what exactly you are proposing and how you
> > > are going to determine these conservative precision marks. But I'd
> > > like to learn some new ideas, so please elaborate :)
> > 
> > I'll send a follow-up email, trying to figure out what to do with pointers.
> 
> Ok, sounds good.

(below explains why DFA was a bad idea, please proceed to the next
 section if that's not interesting).

My initial thought was that it would be possible to use a simple
live-variable DFA in order to compute conservative read and precise
marks. Like a textbook version:
- split code in basic blocks;
- for each instruction propagate liveness information backwards:
  - define USE(insn) to be a set of registers / stack slots read by
    this instruction;
  - define KILL(insn) to be a set of registers / stack slots written
    to by this instruction.
  - compute set of live variables before insn as:
    LIVE_in(insn) = USE(insn) U (LIVE_out(insn) / KILL(insn))
- compute basic block BB_out as a union of BB_in of each successor;
- etc... iterate in postorder until fixed point is reached.

However, this is a moot point because of presence of pointers.
For series of instructions like:
  "r1 = r10; r1 += -8; r1 = *(u64 *)(r1 + 0);"
Such algorithm would need to know possible range of values for r1.
Meaning that before executing liveness DFA some sort of value range
analysis would have to be run.
And these things might get complicated [1]
(I never implemented one, maybe it's not that bad).
So, basically that would be a mini-copy of the verifier but
implemented as a thing that computes fixed point instead of "tracing".
Probably a no-go.

[1] https://boxbase.org/entries/2017/oct/23/range_analysis_papers/

---

However, I think there are two more relatively simple options that
should give correct results.

Option A. "Exit or Loop"
------------------------

- Similar to Andrii's suggestion postpone exploration of
  bpf_iter_*_next() #N->#N+1 states for as long as there are other
  states to explore.
- When evaluating is_state_visited() for bpf_iter_*_next() #N->#N+1
  allow to use visited state with .branches > 0 to prune current
  state if:
  - states_equal() for visited and current states returns true;
  - all verfication paths starting from current instruction index with
    current iterator ref_obj_id either run to safe exit or run back to
    current instruction index with current iterator ref_obj_id.

The last rule is needed to:
- Ensure presence of read and precision marks in a state used for pruning.
- Give a meaning to the precise scalars comparisons in regsafe() which
  I think is currently missing for bpf_iter_*_next() .branches > 0 case.
  Precise scalars comparison for visited states with .branches == 0
  could be read as: "using any value from a specific range it is
  possible to reach safe exit".
  Precise scalars comparison for looping states from above
  could be read as: "using any value from a specific range it is
  possible to reach either safe exit or get back to this instruction".

This algorithm should be able to handle simple iteration patterns like:

    bpf_iter_*_new(&it1, ...)
    while (!bpf_iter_*_next(&it1)) { ... }
    bpf_iter_*_destroy(&it1)

And also handle nested iteration:

    bpf_iter_*_new(&it1, ...)
    while (!bpf_iter_*_next(&it1)) {
      bpf_iter_*_new(&it2, ...)
      while (!bpf_iter_*_next(&it2)) { ... }
      bpf_iter_*_destroy(&it2)
    }
    bpf_iter_*_destroy(&it1)

But it would fail to handle more complex patterns, e.g. interleaved
iteration:

    for (;;) {
      if (!bpf_iter_*_next(&it1))  // a
        break;
      if (!bpf_iter_*_next(&it2))  // b
        break;
      ...
    }

For any state originating in (a) there would always be a state in (b)
pending verification and vice versa. It would actually bail out even
if it1 == it2.

Option B. "Widening"
--------------------

The trivial fix for current .branches > 0 states comparison is to
force "exact" states comparisons for is_iter_next_insn() case:
1. Ignore liveness marks, as those are not finalized yet;
2. Ignore precision marks, as those are not finalized yet;
3. Use regs_exact() for scalars comparison.

This, however, is very restrictive as it fails to verify trivial
programs like (iters_looping/simplest_loop):

    sum = 0;
    bpf_iter_num_new(&it, 0, 10);
    while (!(r0 = bpf_iter_num_next()))
      sum += *(u32 *)r0;
    bpf_iter_num_destroy(&it);

Here ever increasing bounds for "sum" prevent regs_exact() from ever
returning true.

One way to lift this limitation is to borrow something like the
"widening" trick from the abstract interpretation papers mentioned
earlier:
- suppose there is current state C and a visited is_iter_next_insn()
  state V with .branches > 0;
- suppose states_equal(V, C) returns true but exact states comparison
  returns false because there are scalar values not marked as precise
  that differ between V and C.
- copy C as C' and mark these scalars as __mark_reg_unbounded() in C';
- run verification for C':
  - if verification succeeds -- problem solved;
  - if verification fails -- discard C' and proceed from C.

Such algorithm should succeed for programs that don't use widened
values in "precise" context.

Looking at testcases failing with trivial fix I think that such
limitations would be similar to those already present (e.g. the big
comment in progs/iters.c:iter_obfuscate_counter would still apply).

---

Option B appears to be simpler to implement so I'm proceeding with it.
Will share progress on Monday. Please let me know if there are any
obvious flaws with what I shared above.

Thanks,
Eduard