Re: [PATCH v5 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Tue, 5 Mar 2024 13:18:18 -0800

On Tue, Mar 5, 2024 at 11:03 AM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Mon, Mar 4, 2024 at 8:52 PM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > From: Alexei Starovoitov <ast@xxxxxxxxxx>
> >
> > When open code iterators, bpf_loop or may_goto is used the following two states
> > are equivalent and safe to prune the search:
> >
> > cur state: fp-8_w=scalar(id=3,smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf))
> > old state: fp-8_rw=scalar(id=2,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf))
> >
> > In other words "exact" state match should ignore liveness and precision marks,
> > since open coded iterator logic didn't complete their propagation,
> > but range_within logic that applies to scalars, ptr_to_mem, map_value, pkt_ptr
> > is safe to rely on.
> >
> > Avoid doing such comparison when regular infinite loop detection logic is used,
> > otherwise bounded loop logic will declare such "infinite loop" as false
> > positive. Such example is in progs/verifier_loops1.c not_an_inifinite_loop().
> >
> > Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
> > ---
>
> You haven't updated stacksafe() to work with enums (the code still
> assumes bool), please adjust.

great catch.

> pw-bot: cr
>
> Also, I wonder if we actually need three cases. We need EXACT only for
> infinite loop detection, right? I wonder if that infinite loop
> detection is actually that useful in practice, I rarely encounter it
> (if at all), and it constantly causes issues.
>
> What if we just dropped infinite loop detection logic altogether and
> have NOT_EXACT vs RANGE_WITHIN logic only?

The infinite loop logic is for better user experience.
With:
time ./test_verifier 145
#145/p calls: conditional call 6 OK
Summary: 1 PASSED, 0 SKIPPED, 0 FAILED

real    0m0.191s
user    0m0.001s
sys    0m0.053s

without (on prod kernel):
time ./test_verifier 145
#145/p calls: conditional call 6 FAIL
Unexpected error message!
    EXP: infinite loop detected
    RES: BPF program is too large. Processed 1000001 insn
verification time 5182443 usec
stack depth 0+0
processed 1000001 insns (limit 1000000) max_states_per_insn 4
total_states 33336 peak_states 33336 mark_read 1

real    0m5.375s
user    0m0.002s
sys    0m4.700s

I hope that eventually we can replace the 1M insn limit with
"if the verifier is making progress, keep going".
Then states_maybe_looping() will be replaced with something else.
It should address non-converging bpf_loop/open iterators
that hit 1M right now while walking the same insns over and over again.
Something like "did we visit this insn 1k times" may be enough.

> >  kernel/bpf/verifier.c | 39 ++++++++++++++++++++++-----------------
> >  1 file changed, 22 insertions(+), 17 deletions(-)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 226bb65f9c2c..74b55d5571c7 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -16254,8 +16254,8 @@ static int check_btf_info(struct bpf_verifier_env *env,
> >  }
> >
> >  /* check %cur's range satisfies %old's */
> > -static bool range_within(struct bpf_reg_state *old,
> > -                        struct bpf_reg_state *cur)
> > +static bool range_within(const struct bpf_reg_state *old,
> > +                        const struct bpf_reg_state *cur)
> >  {
> >         return old->umin_value <= cur->umin_value &&
> >                old->umax_value >= cur->umax_value &&
> > @@ -16419,21 +16419,26 @@ static bool regs_exact(const struct bpf_reg_state *rold,
> >                check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap);
> >  }
> >
> > +enum exact_level {
> > +       NOT_EXACT,
> > +       EXACT,
> > +       RANGE_WITHIN
> > +};
> > +
> >  /* Returns true if (rold safe implies rcur safe) */
> >  static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
> > -                   struct bpf_reg_state *rcur, struct bpf_idmap *idmap, bool exact)
> > +                   struct bpf_reg_state *rcur, struct bpf_idmap *idmap,
> > +                   enum exact_level exact)
> >  {
> > -       if (exact)
> > +       if (exact == EXACT)
> >                 return regs_exact(rold, rcur, idmap);
> >
> > -       if (!(rold->live & REG_LIVE_READ))
> > +       if (!(rold->live & REG_LIVE_READ) && exact != RANGE_WITHIN)
>
> imo, `exact == NOT_EXACT` is easier to follow (we excluded EXACT above)

ok.

> >                 /* explored state didn't use this */
> >                 return true;
> > -       if (rold->type == NOT_INIT)
> > +       if (rold->type == NOT_INIT && exact != RANGE_WITHIN)
>
> ditto
>
> >                 /* explored state can't have used this */
> >                 return true;
> > -       if (rcur->type == NOT_INIT)
> > -               return false;
>
> do we need to remove this? in which case will it do the wrong thing?

Yes.
Consider cur-type == rold->type == NOT_INIT exact == RANGE_WITHIN
and regsafe() will return false, but should return true via:
        default:
                return regs_exact(rold, rcur, idmap);

Having said that it's better to change above two 'if'-s to
if (rold->type == NOT_INIT) {
    if (exact == NOT_EXACT || rcur->type == NOT_INIT)
      return true;
}
to avoid doing regs_exact() for the common case.

> >
> >         /* Enforce that register types have to match exactly, including their
> >          * modifiers (like PTR_MAYBE_NULL, MEM_RDONLY, etc), as a general
> > @@ -16468,7 +16473,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
> >                         return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 &&
> >                                check_scalar_ids(rold->id, rcur->id, idmap);
> >                 }
> > -               if (!rold->precise)
> > +               if (!rold->precise && exact != RANGE_WITHIN)
>
> same, I think explicit `exact == NOT_EXACT` is easier to follow

ok