On Tue, Mar 5, 2024 at 11:03 AM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: > > On Mon, Mar 4, 2024 at 8:52 PM Alexei Starovoitov > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > From: Alexei Starovoitov <ast@xxxxxxxxxx> > > > > When open code iterators, bpf_loop or may_goto is used the following two states > > are equivalent and safe to prune the search: > > > > cur state: fp-8_w=scalar(id=3,smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf)) > > old state: fp-8_rw=scalar(id=2,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf)) > > > > In other words "exact" state match should ignore liveness and precision marks, > > since open coded iterator logic didn't complete their propagation, > > but range_within logic that applies to scalars, ptr_to_mem, map_value, pkt_ptr > > is safe to rely on. > > > > Avoid doing such comparison when regular infinite loop detection logic is used, > > otherwise bounded loop logic will declare such "infinite loop" as false > > positive. Such example is in progs/verifier_loops1.c not_an_inifinite_loop(). > > > > Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx> > > --- > > You haven't updated stacksafe() to work with enums (the code still > assumes bool), please adjust. great catch. > pw-bot: cr > > Also, I wonder if we actually need three cases. We need EXACT only for > infinite loop detection, right? I wonder if that infinite loop > detection is actually that useful in practice, I rarely encounter it > (if at all), and it constantly causes issues. > > What if we just dropped infinite loop detection logic altogether and > have NOT_EXACT vs RANGE_WITHIN logic only? The infinite loop logic is for better user experience. With: time ./test_verifier 145 #145/p calls: conditional call 6 OK Summary: 1 PASSED, 0 SKIPPED, 0 FAILED real 0m0.191s user 0m0.001s sys 0m0.053s without (on prod kernel): time ./test_verifier 145 #145/p calls: conditional call 6 FAIL Unexpected error message! EXP: infinite loop detected RES: BPF program is too large. Processed 1000001 insn verification time 5182443 usec stack depth 0+0 processed 1000001 insns (limit 1000000) max_states_per_insn 4 total_states 33336 peak_states 33336 mark_read 1 real 0m5.375s user 0m0.002s sys 0m4.700s I hope that eventually we can replace the 1M insn limit with "if the verifier is making progress, keep going". Then states_maybe_looping() will be replaced with something else. It should address non-converging bpf_loop/open iterators that hit 1M right now while walking the same insns over and over again. Something like "did we visit this insn 1k times" may be enough. > > kernel/bpf/verifier.c | 39 ++++++++++++++++++++++----------------- > > 1 file changed, 22 insertions(+), 17 deletions(-) > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index 226bb65f9c2c..74b55d5571c7 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -16254,8 +16254,8 @@ static int check_btf_info(struct bpf_verifier_env *env, > > } > > > > /* check %cur's range satisfies %old's */ > > -static bool range_within(struct bpf_reg_state *old, > > - struct bpf_reg_state *cur) > > +static bool range_within(const struct bpf_reg_state *old, > > + const struct bpf_reg_state *cur) > > { > > return old->umin_value <= cur->umin_value && > > old->umax_value >= cur->umax_value && > > @@ -16419,21 +16419,26 @@ static bool regs_exact(const struct bpf_reg_state *rold, > > check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap); > > } > > > > +enum exact_level { > > + NOT_EXACT, > > + EXACT, > > + RANGE_WITHIN > > +}; > > + > > /* Returns true if (rold safe implies rcur safe) */ > > static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold, > > - struct bpf_reg_state *rcur, struct bpf_idmap *idmap, bool exact) > > + struct bpf_reg_state *rcur, struct bpf_idmap *idmap, > > + enum exact_level exact) > > { > > - if (exact) > > + if (exact == EXACT) > > return regs_exact(rold, rcur, idmap); > > > > - if (!(rold->live & REG_LIVE_READ)) > > + if (!(rold->live & REG_LIVE_READ) && exact != RANGE_WITHIN) > > imo, `exact == NOT_EXACT` is easier to follow (we excluded EXACT above) ok. > > /* explored state didn't use this */ > > return true; > > - if (rold->type == NOT_INIT) > > + if (rold->type == NOT_INIT && exact != RANGE_WITHIN) > > ditto > > > /* explored state can't have used this */ > > return true; > > - if (rcur->type == NOT_INIT) > > - return false; > > do we need to remove this? in which case will it do the wrong thing? Yes. Consider cur-type == rold->type == NOT_INIT exact == RANGE_WITHIN and regsafe() will return false, but should return true via: default: return regs_exact(rold, rcur, idmap); Having said that it's better to change above two 'if'-s to if (rold->type == NOT_INIT) { if (exact == NOT_EXACT || rcur->type == NOT_INIT) return true; } to avoid doing regs_exact() for the common case. > > > > /* Enforce that register types have to match exactly, including their > > * modifiers (like PTR_MAYBE_NULL, MEM_RDONLY, etc), as a general > > @@ -16468,7 +16473,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold, > > return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 && > > check_scalar_ids(rold->id, rcur->id, idmap); > > } > > - if (!rold->precise) > > + if (!rold->precise && exact != RANGE_WITHIN) > > same, I think explicit `exact == NOT_EXACT` is easier to follow ok