On Sat, 2023-10-21 at 21:28 -0700, Andrii Nakryiko wrote: > On Sat, Oct 21, 2023 at 6:08 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > > > It turns out that .branches > 0 in is_state_visited() is not a > > sufficient condition to identify if two verifier states form a loop > > when iterators convergence is computed. This commit adds logic to > > distinguish situations like below: > > > > (I) initial (II) initial > > | | > > V V > > .---------> hdr .. > > | | | > > | V V > > | .------... .------.. > > | | | | | > > | V V V V > > | ... ... .-> hdr .. > > | | | | | | > > | V V | V V > > | succ <- cur | succ <- cur > > | | | | > > | V | V > > | ... | ... > > | | | | > > '----' '----' > > > > For both (I) and (II) successor 'succ' of the current state 'cur' was > > previously explored and has branches count at 0. However, loop entry > > 'hdr' corresponding to 'succ' might be a part of current DFS path. > > If that is the case 'succ' and 'cur' are members of the same loop > > and have to be compared exactly. > > > > Co-developed-by: Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> > > Co-developed-by: Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> > > Signed-off-by: Eduard Zingerman <eddyz87@xxxxxxxxx> > > --- > > include/linux/bpf_verifier.h | 15 +++ > > kernel/bpf/verifier.c | 207 ++++++++++++++++++++++++++++++++++- > > 2 files changed, 218 insertions(+), 4 deletions(-) > > > > LGTM, but see the note below about state being its own loop_entry. It > feels like a bit better approach would be to use "loop_entry_ref_cnt" > instead of just a bool used_as_loop_entry, and do a proper accounting > when child state is freed and when propagating loop_entries. But > perhaps that can be done in a follow up, if you think it's a good > idea. I though about reference counting but decided to use flag instead because it's a bit simpler. In any case the full mechanism is opportunistic and having a few stale states shouldn't be a big deal, those would be freed when syscall exits. I'll make ref_cnt version and send it as a follow-up, so we can decide looking at the code whether to peek it or drop it. > > Reviewed-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h > > index 38b788228594..24213a99cc79 100644 > > --- a/include/linux/bpf_verifier.h > > +++ b/include/linux/bpf_verifier.h > [...] > > @@ -16825,7 +17023,8 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) > > * speed up verification > > */ > > *pprev = sl->next; > > - if (sl->state.frame[0]->regs[0].live & REG_LIVE_DONE) { > > + if (sl->state.frame[0]->regs[0].live & REG_LIVE_DONE && > > + !sl->state.used_as_loop_entry) { > > In get_loop_entry() you have an additional `topmost != > topmost->loop_entry` check, suggesting that state can be its own > loop_entry. Can that happen? It can, e.g. in the following loop: loop: r1 = r10; r1 += -8; --- checkpoint here --- call %[bpf_iter_num_next]; goto loop; > And if yes, should we account for that here? With flag I don't think we need to account for it here because it's a best-effort thing anyways. (E.g. it misses situation when something was marked as loop entry initially than entry upper in DFS chain had been found). With reference count -- yes, it would me more important.