Re: [PATCH bpf-next v2 5/7] bpf: correct loop detection for iterators convergence

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Mon, 23 Oct 2023 09:16:16 -0700

On Mon, Oct 23, 2023 at 7:47 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Sat, 2023-10-21 at 21:28 -0700, Andrii Nakryiko wrote:
> > On Sat, Oct 21, 2023 at 6:08 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> > >
> > > It turns out that .branches > 0 in is_state_visited() is not a
> > > sufficient condition to identify if two verifier states form a loop
> > > when iterators convergence is computed. This commit adds logic to
> > > distinguish situations like below:
> > >
> > >  (I)            initial       (II)            initial
> > >                   |                             |
> > >                   V                             V
> > >      .---------> hdr                           ..
> > >      |            |                             |
> > >      |            V                             V
> > >      |    .------...                    .------..
> > >      |    |       |                     |       |
> > >      |    V       V                     V       V
> > >      |   ...     ...               .-> hdr     ..
> > >      |    |       |                |    |       |
> > >      |    V       V                |    V       V
> > >      |   succ <- cur               |   succ <- cur
> > >      |    |                        |    |
> > >      |    V                        |    V
> > >      |   ...                       |   ...
> > >      |    |                        |    |
> > >      '----'                        '----'
> > >
> > > For both (I) and (II) successor 'succ' of the current state 'cur' was
> > > previously explored and has branches count at 0. However, loop entry
> > > 'hdr' corresponding to 'succ' might be a part of current DFS path.
> > > If that is the case 'succ' and 'cur' are members of the same loop
> > > and have to be compared exactly.
> > >
> > > Co-developed-by: Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx>
> > > Co-developed-by: Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx>
> > > Signed-off-by: Eduard Zingerman <eddyz87@xxxxxxxxx>
> > > ---
> > >  include/linux/bpf_verifier.h |  15 +++
> > >  kernel/bpf/verifier.c        | 207 ++++++++++++++++++++++++++++++++++-
> > >  2 files changed, 218 insertions(+), 4 deletions(-)
> > >
> >
> > LGTM, but see the note below about state being its own loop_entry. It
> > feels like a bit better approach would be to use "loop_entry_ref_cnt"
> > instead of just a bool used_as_loop_entry, and do a proper accounting
> > when child state is freed and when propagating loop_entries. But
> > perhaps that can be done in a follow up, if you think it's a good
> > idea.
>
> I though about reference counting but decided to use flag instead
> because it's a bit simpler. In any case the full mechanism is
> opportunistic and having a few stale states shouldn't be a big deal,
> those would be freed when syscall exits.
> I'll make ref_cnt version and send it as a follow-up, so we can decide
> looking at the code whether to peek it or drop it.
>
> >
> > Reviewed-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> >
> > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > > index 38b788228594..24213a99cc79 100644
> > > --- a/include/linux/bpf_verifier.h
> > > +++ b/include/linux/bpf_verifier.h
> >
> [...]
> > > @@ -16825,7 +17023,8 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> > >                          * speed up verification
> > >                          */
> > >                         *pprev = sl->next;
> > > -                       if (sl->state.frame[0]->regs[0].live & REG_LIVE_DONE) {
> > > +                       if (sl->state.frame[0]->regs[0].live & REG_LIVE_DONE &&
> > > +                           !sl->state.used_as_loop_entry) {
> >
> > In get_loop_entry() you have an additional `topmost !=
> > topmost->loop_entry` check, suggesting that state can be its own
> > loop_entry. Can that happen?
>
> It can, e.g. in the following loop:
>
>     loop: r1 = r10;
>           r1 += -8;
>        --- checkpoint here ---
>           call %[bpf_iter_num_next];
>           goto loop;
>
>
> > And if yes, should we account for that here?
>
> With flag I don't think we need to account for it here because it's a
> best-effort thing anyways. (E.g. it misses situation when something
> was marked as loop entry initially than entry upper in DFS chain had
> been found). With reference count -- yes, it would me more important.
>

OK, no big deal, I wanted to make sure we won't leak some states, if
they are their own loop_entry.