Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)

Eduard Zingerman <eddyz87@xxxxxxxxx> · Wed, 04 Oct 2023 00:52:07 +0300

On Tue, 2023-10-03 at 11:50 -0700, Alexei Starovoitov wrote:
> On Tue, Oct 3, 2023 at 8:33 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> > 
> > When I put states to the loop stack I do copy_verifier_state() and
> > increase .branches counter for each state parent, so this should not
> > trigger warnings with update_branch_counts().
> 
> No warn doesn't mean it's correct.
> I suspect what your reschedule_loop_states() will increase
> branches from 0 to 1.
> Which is equivalent to increasing refcnt from zero. Not good.

Not really, here is how I modified bpf_verifier_env:

    struct bpf_verifier_state_stack {
        struct bpf_verifier_stack_elem *head; /* stack of verifier states to be processed */
        int size;                             /* number of states to be processed */
    };

    struct bpf_verifier_env {
        ...
        struct bpf_verifier_state_stack stack;
        struct bpf_verifier_state_stack loop_stack;
        ...
    }

Here env->stack is used for DFS traversal and env->loop_stack is used
to delay verification of the loop states.

When bpf_iter_next() is reached in state C and states_equal() shows
that there is potentially equivalent state V:
- copy C' of C is created sing copy_verifier_state(), it updates all
  branch counters up the ownership chain as with any other state;
- C' is put to env->loop_stack.

The reschedule_loop_states() [1] loops over states in loop_stack to
see if there are states that no longer have equivalent state, such
states are removed from env->loop_stack and put to env->stack.
This is done without branch counters update, so at any point in time
parent states have loop state accounted for in their branch counters.
Branch counter *never* transitions from 1 to 0 and than to 1 again
during this process.

[1] https://github.com/kernel-patches/bpf/compare/bpf-next_base...eddyz87:bpf:iters-bug-delayed-traversal#diff-edbb57adf10d1ce1fbb830a34fa92712fd01db1fbd9b6f2504001eb7bcc7b9d0R16823

> > Logically, any state
> > that has a loop state as it's grandchild is not verified to be safe
> > until loops steady state is achieved, thus such states could not be
> > used for states pruning and it is correct to keep their branch
> > counters > 0.
> 
> Correct.
>
> > propagate_liveness() and propagate_precision() should not be called
> > for the delayed states (work-in-progress I'll update the patch).
> > Behavior for non-delayed states should not be altered by these changes.
> 
> I disagree. Delayed states should not have different propagation rules.

Upon first discovery the loop state C is states_equal to some state V,
precision and read marks are not yet finalized and thus states_equal(V, C)
become false at some point. Also, C should not be used for states pruning.
Thus, there is no need to propagate liveness and precision from V to C
at this point.

It is possible and correct to propagate liveness and precision from V
to C when loop steady state is achieved, as at that point we know for
sure that C is a sub-state of V. However, currently loop states are
tracked globally and no states are processed after loops steady state
is reached, hence I don't do liveness and precision propagation.

On the other hand, if loop steady state would be tracked non-globally
e.g. for top level loop headers as I wrote in the other email today,
propagation of liveness and precision information would make sense,
as these states could be used for state pruning later.

> Iterators and callbacks should not be special.
> They are no different than bounded loops
> and their verification shouldn't be much different.
> Bounded loops have constant upper bound while
> iterators do not, so they have to converge based on state
> equivalence, but the verifier still has to do DFS and discover
> all possible paths, propagate precision and liveness before
> considering two states equivalent.
> It does this today for bounded loops and should do the same
> for iterators.
> The question is how to do it.
> Delaying states and breaking DFS assumptions is a non starter.

I think that absence of bound is a fundamental difference.

On each verification path of a bounded loop there is always a precise
value that at some point triggers loop termination. There is no need
for states equivalence checks (except for verification performance).

Each verification path of iterator or callback based loop is logically
an infinite series of states. In order to terminate verification of
such path it is necessary to:
- either find some prior state V identical to current state C;
- or find some prior state V such that current state C is a sub-state of V.

Terminating infinite series by identical state turns out to be very
limiting in practice.

On the other hand, sub-state check requires finalization of read and
precision marks in V. We've seen that such marks could be concealed at
an unknown depth. In addition, state V might be a parent state for
multiple grandchildren contributing to precision and read marks.
These grandchildren states are created when conditional jumps are
processed on possibly different iteration depths.

The necessity to visit all grandchildren states of V leads to
necessity of mixed DFS and BFS traversal of the loop states.

(In fact, this is not a special case but a generalization,
 bounded loops processing could be done within the same framework).

> I see now that my 2nd hack is still buggy.
> Differentiating state with branches vs looping_states counters
> doesn't seem to work in all cases.
> Your loop_state_deps1 demonstrates that. It's a great test.

I'm glad this test is useful.