Re: [PATCH v5 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Tue, 5 Mar 2024 13:57:10 -0800

On Tue, Mar 5, 2024 at 1:18 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Tue, Mar 5, 2024 at 11:03 AM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> >
> > On Mon, Mar 4, 2024 at 8:52 PM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > From: Alexei Starovoitov <ast@xxxxxxxxxx>
> > >
> > > When open code iterators, bpf_loop or may_goto is used the following two states
> > > are equivalent and safe to prune the search:
> > >
> > > cur state: fp-8_w=scalar(id=3,smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf))
> > > old state: fp-8_rw=scalar(id=2,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf))
> > >
> > > In other words "exact" state match should ignore liveness and precision marks,
> > > since open coded iterator logic didn't complete their propagation,
> > > but range_within logic that applies to scalars, ptr_to_mem, map_value, pkt_ptr
> > > is safe to rely on.
> > >
> > > Avoid doing such comparison when regular infinite loop detection logic is used,
> > > otherwise bounded loop logic will declare such "infinite loop" as false
> > > positive. Such example is in progs/verifier_loops1.c not_an_inifinite_loop().
> > >
> > > Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
> > > ---
> >
> > You haven't updated stacksafe() to work with enums (the code still
> > assumes bool), please adjust.
>
> great catch.
>
> > pw-bot: cr
> >
> > Also, I wonder if we actually need three cases. We need EXACT only for
> > infinite loop detection, right? I wonder if that infinite loop
> > detection is actually that useful in practice, I rarely encounter it
> > (if at all), and it constantly causes issues.
> >
> > What if we just dropped infinite loop detection logic altogether and
> > have NOT_EXACT vs RANGE_WITHIN logic only?
>
> The infinite loop logic is for better user experience.
> With:
> time ./test_verifier 145
> #145/p calls: conditional call 6 OK
> Summary: 1 PASSED, 0 SKIPPED, 0 FAILED
>
> real    0m0.191s
> user    0m0.001s
> sys    0m0.053s
>
> without (on prod kernel):
> time ./test_verifier 145
> #145/p calls: conditional call 6 FAIL
> Unexpected error message!
>     EXP: infinite loop detected
>     RES: BPF program is too large. Processed 1000001 insn
> verification time 5182443 usec
> stack depth 0+0
> processed 1000001 insns (limit 1000000) max_states_per_insn 4
> total_states 33336 peak_states 33336 mark_read 1
>
> real    0m5.375s
> user    0m0.002s
> sys    0m4.700s
>
> I hope that eventually we can replace the 1M insn limit with
> "if the verifier is making progress, keep going".
> Then states_maybe_looping() will be replaced with something else.
> It should address non-converging bpf_loop/open iterators
> that hit 1M right now while walking the same insns over and over again.
> Something like "did we visit this insn 1k times" may be enough.

Yeah, my hopes were very low for getting rid of that check, but it
always is causing problems... Alright, fine, we'll have to live with a
more complicated tri-state exactness check, unfortunately.

>
> > >  kernel/bpf/verifier.c | 39 ++++++++++++++++++++++-----------------
> > >  1 file changed, 22 insertions(+), 17 deletions(-)
> > >
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 226bb65f9c2c..74b55d5571c7 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -16254,8 +16254,8 @@ static int check_btf_info(struct bpf_verifier_env *env,
> > >  }
> > >
> > >  /* check %cur's range satisfies %old's */
> > > -static bool range_within(struct bpf_reg_state *old,
> > > -                        struct bpf_reg_state *cur)
> > > +static bool range_within(const struct bpf_reg_state *old,
> > > +                        const struct bpf_reg_state *cur)
> > >  {
> > >         return old->umin_value <= cur->umin_value &&
> > >                old->umax_value >= cur->umax_value &&
> > > @@ -16419,21 +16419,26 @@ static bool regs_exact(const struct bpf_reg_state *rold,
> > >                check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap);
> > >  }
> > >
> > > +enum exact_level {
> > > +       NOT_EXACT,
> > > +       EXACT,
> > > +       RANGE_WITHIN
> > > +};
> > > +
> > >  /* Returns true if (rold safe implies rcur safe) */
> > >  static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
> > > -                   struct bpf_reg_state *rcur, struct bpf_idmap *idmap, bool exact)
> > > +                   struct bpf_reg_state *rcur, struct bpf_idmap *idmap,
> > > +                   enum exact_level exact)
> > >  {
> > > -       if (exact)
> > > +       if (exact == EXACT)
> > >                 return regs_exact(rold, rcur, idmap);
> > >
> > > -       if (!(rold->live & REG_LIVE_READ))
> > > +       if (!(rold->live & REG_LIVE_READ) && exact != RANGE_WITHIN)
> >
> > imo, `exact == NOT_EXACT` is easier to follow (we excluded EXACT above)
>
> ok.
>
> > >                 /* explored state didn't use this */
> > >                 return true;
> > > -       if (rold->type == NOT_INIT)
> > > +       if (rold->type == NOT_INIT && exact != RANGE_WITHIN)
> >
> > ditto
> >
> > >                 /* explored state can't have used this */
> > >                 return true;
> > > -       if (rcur->type == NOT_INIT)
> > > -               return false;
> >
> > do we need to remove this? in which case will it do the wrong thing?
>
> Yes.
> Consider cur-type == rold->type == NOT_INIT exact == RANGE_WITHIN
> and regsafe() will return false, but should return true via:
>         default:
>                 return regs_exact(rold, rcur, idmap);

oh, very subtle and error-prone, so yeah, I like the explicit check below more

>
> Having said that it's better to change above two 'if'-s to
> if (rold->type == NOT_INIT) {
>     if (exact == NOT_EXACT || rcur->type == NOT_INIT)
>       return true;
> }
> to avoid doing regs_exact() for the common case.

sure, give it a try, but I got to say that it is now quite hard to
follow the logic with the tri-state "exactness"

>
> > >
> > >         /* Enforce that register types have to match exactly, including their
> > >          * modifiers (like PTR_MAYBE_NULL, MEM_RDONLY, etc), as a general
> > > @@ -16468,7 +16473,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
> > >                         return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 &&
> > >                                check_scalar_ids(rold->id, rcur->id, idmap);
> > >                 }
> > > -               if (!rold->precise)
> > > +               if (!rold->precise && exact != RANGE_WITHIN)
> >
> > same, I think explicit `exact == NOT_EXACT` is easier to follow
>
> ok