Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Mon, 2 Oct 2023 19:00:41 -0700

On Mon, Oct 2, 2023 at 5:05 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Mon, Oct 2, 2023 at 10:18 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> >
> > On Mon, 2023-10-02 at 09:29 -0700, Alexei Starovoitov wrote:
> > [...]
> > > > I'd like to argue about B "widening" for a bit, as I think it might be
> > > > interesting in general, and put A aside for now. The algorithm for
> > > > widening looks as follows:
> > > > - In is_states_equal() for (sl->state.branches > 0 && is_iter_next_insn()) case:
> > > >   - Check if states are equal exactly:
> > > >     - ignore liveness marks on old state;
> > > >     - demand same type for registers and stack slots;
> > > >     - ignore precision marks, instead compare scalars using
> > > >       regs_exact() [this differs from my previous emails, I'm now sure
> > > >       that for this scheme to be correct regs_exact() is needed].
> > > >   - If there is an exact match then follow "hit" branch. The idea
> > > >     being that visiting exactly the same state can't produce new
> > > >     execution paths (like with graph traversal).
> > >
> > > Right. Exactly the same C state won't produce new paths
> > > as seen in visited state V, but
> > > if C==V at the same insn indx it means we're in the infinite loop.
> >
> > This is true in general, but for bpf_iter_next() we have a guarantee
> > that iteration would end eventually.
> >
> > > > More formally, before pruning potential looping states we need to
> > > > make sure that all precision and read marks are in place.
> > > > To achieve this:
> > > > - Process states from env->head while those are available, in case if
> > > >   potential looping state (is_states_equal()) is reached put it to a
> > > >   separate queue.
> > > > - Once all env->head states are processed the only source for new read
> > > >   and precision marks is in postponed looping states, some of which
> > > >   might not be is_states_equal() anymore. Submit each such state for
> > > >   verification until fixed point is reached (repeating steps for
> > > >   env->head processing).
> > >
> > > Comparing if (sl->state.branches) makes sense to find infinite loop.
> > > It's waste for the verifier to consider visited state V with branches > 0
> > > for pruning.
> > > The safety of V is unknown. The lack of liveness and precision
> > > is just one part. The verifier didn't conclude that V is safe yet.
> > > The current state C being equivalent to V doesn't tell us anything.
> > >
> > > If infinite loop detection logic trips us, let's disable it.
> > > I feel the fix should be in process_iter_next_call() to somehow
> > > make it stop doing push_stack() when states_equal(N-1, N-2).
> >
> > Consider that we get to the environment state where:
> > - all env->head states are exhausted;
> > - all potentially looping states (stored in as a separate set of
> >   states instead of env->head) are states_equal() to some already
> >   explored state.
> >
> > I argue that if such environment state is reached the program should
> > be safe, because:
> > - Each looping state L is a sub-state of some explored state V and
> >   every path from V leads to either safe exit or another loop.
> > - Iterator loops are guaranteed to exit eventually.
>
> It sounds correct, but I don't like that the new mechanism
> with two stacks of states completely changes the way the verifier works.
> The check you proposed:
> if (env->stack_size != 0)
>       push_iter_stack()
> rings alarm bells.
>
> env->stack_size == 0 (same as env->head exhausted) means we're done
> with verification (ignoring bpf_exit in callbacks and subprogs).
> So above check looks like a hack for something that I don't understand yet.
> Also there could be branches in the code before and after iter loop.
> With "opportunistic" states_equal() for looping states and delayed
> reschedule_loop_states() to throw states back at the verifier
> the whole verification model is non comprehensible (at least to me).
> The stack + iter_stack + reschedule_loop_states means that in the following:
> foo()
> {
>   br1 // some if() {...} block
>   loop {
>     br2
>   }
>   br3
> }
>
> the normal verifier pop_stack logic will check br3 and br1,
> but br2 may or may not be checked depending on "luck" of states_equal
> and looping states that will be in iter_stack.
> Then the verifier will restart from checking loop-ing states.
> If they somehow go past the end of the loop all kinds of things go crazy.
> update_branch_counts() might warn, propagate_liveness, propagate_precision
> will do nonsensical things.
> This out-of-order state processing distorts the existing model so
> much that I don't see how we can reason about these two stacks verification.
>
>
> I think the cleaner solution is to keep current single stack model.
> In the above example the verifier would reach the end, then check br3,
> then check br2,
> then we need to split branches counter somehow, so that we can
> compare loop iter states with previous visited states that are known
> to be safe.
> In visited states we explored everything in br3 and in br2,
> so no concerns that some path inside the loop or after the loop
> missed precision or liveness.
> Maybe we can split branches counter into branches due to 'if' blocks
> and branches due to process_iter_next_call().
> If there are pending process_iter_next_call-caused states it's still
> ok to call states_equal on such visited states.
>
> I could be missing something, of course.

Attached patch is what I meant.
It needs more work, but gives an idea.
simple iterators load fine and it correctly
detects unsafe code in num_iter_bug() example where you have
0xdeadbeef in R1.
Attachment:
0001-iter-hack.patch

Description: Binary data