On Thu, Nov 9, 2023 at 11:49 AM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Thu, Nov 9, 2023 at 11:29 AM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Thu, Nov 9, 2023 at 9:28 AM Andrii Nakryiko
> > <andrii.nakryiko@xxxxxxxxx> wrote:
> > >
> > > If we ever break the DFS property, we can easily change this. Or we
> > > can even have a hybrid: as long as traversal preserves the DFS
> > > property, we use the global shared history, but we can also
> > > optionally clone and keep our own history if necessary. It's a
> > > matter of adding an optional, potentially NULL pointer to a "local
> > > history". All this is very nicely hidden away from "normal" code.
> >
> > If we can "easily change this" then let's make it the last and optional
> > patch, so we can revert it in the future when we need to take a non-DFS path.
>
> Ok, sounds good. I'll reorder and put it last; you can decide whether
> to apply it or not that way.
>
> >
> > > But again, let's look at the data first. I'll get back with numbers soon.
> >
> > Sure. I think a memory increase due to more tracking is ok.
> > I suspect it won't cause a 2x increase. Likely a few %.
> > The last time I checked, the main memory hog is states stashed for pruning.
>
> So I'm back with data. See the verifier.c changes I did at the bottom,
> just to double-check I'm not missing something major. I count the number
> of allocations (an underestimate, since it doesn't take reallocs into
> account), the total number of instruction history entries for the entire
> program verification, and also the peak "depth" of the instruction
> history. Note that entry counts should be multiplied by 8 to get the
> number of bytes (and that's not counting per-allocation overhead).
>
> Here are the top 20 results, sorted by number of allocs, for
> Meta-internal, Cilium, and selftest programs. BEFORE is without the
> added STACK_ACCESS tracking and the STACK_ZERO optimization. AFTER is
> with all the patches of this patch set applied.
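[Editor's note: a toy Python model, not kernel code, of what the three
counters in the experiment below measure. Names mirror the patch's fields;
the state tree and the "4 history entries per state" figure are made up
purely for illustration.]

```python
# Toy model: every verifier state owns a slice [hist_start, hist_end) of
# instruction history. When a state is finalized or popped, its slice is
# charged to `total` and `allocs` (what per-state copies would cost);
# `peak` tracks the deepest the single shared history ever grows
# (the footprint of the global-history scheme).

class Stats:
    def __init__(self):
        self.allocs = 0   # per-state history slices (allocations in the old scheme)
        self.total = 0    # sum of slice lengths (total copied entries, old scheme)
        self.peak = 0     # max depth of the shared history (new scheme)

def explore(depth, hist_end, stats, branch_factor=2, max_depth=3):
    """DFS over a toy state tree; each state simulates 4 history entries.

    Passing hist_end by value means a sibling reuses the region its
    predecessor backtracked out of -- exactly why DFS lets the history
    be shared globally.
    """
    hist_start = hist_end
    hist_end += 4
    stats.peak = max(stats.peak, hist_end)
    if depth < max_depth:
        for _ in range(branch_factor):
            explore(depth + 1, hist_end, stats)
    # state done: charge its slice, as pop_stack()/is_state_visited() do
    stats.allocs += 1
    stats.total += hist_end - hist_start

stats = Stats()
explore(0, 0, stats)
print(stats.allocs, stats.total, stats.peak)  # 15 60 16
```

Multiplying by 8 bytes per entry, as the email does: the copied-history
model costs 60 * 8 = 480 bytes here versus a 16 * 8 = 128-byte shared
array, and the gap widens with the branching factor.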
>
> It's a few megabytes of memory allocation, which in itself is probably
> not a big deal. But it's an unnecessary number of memory allocations:
> basically at least 2x the total number of states that we can save. We
> can instead have just a few reallocs to size the global jump history,
> with a peak entry count that is orders of magnitude smaller.
>
> And if we ever decide to track more stuff similar to
> INSNS_F_STACK_ACCESS, we won't have to worry about more allocations or
> more memory usage, because the absolute worst case is that our global
> history grows to 1 million entries, tops. We can track some *code path
> dependent* per-instruction information for *each simulated instruction*
> easily, without having to think twice about it. Which I think is a
> nice, liberating thought in itself, justifying this change.
>
> Gmail butchered the tables. See the GitHub gist ([0]) for them properly
> formatted.
>
>   [0] https://gist.github.com/anakryiko/04c5a3a5ae4ee672bd11d4b7b3d832f5
>
> Stats counting diff:
>
> $ git show -- kernel
> commit febebc9586c08820fa927b1628454b2709e98e3f (HEAD)
> Author: Andrii Nakryiko <andrii@xxxxxxxxxx>
> Date:   Thu Nov 9 11:02:40 2023 -0800
>
>     [EXPERIMENT] bpf: add jump/insns history stats
>
>     Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index b688043e5460..d0f25f36221e 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -2026,6 +2026,10 @@ static int pop_stack(struct bpf_verifier_env *env, int *prev_insn_idx,
>  		return -ENOENT;
>
>  	if (cur) {
> +		env->jmp_hist_peak = max(env->jmp_hist_peak, cur->insn_hist_end);
> +		env->jmp_hist_total += cur->insn_hist_end - cur->insn_hist_start;
> +		env->jmp_hist_allocs += 1;
> +
>  		err = copy_verifier_state(cur, &head->st);
>  		if (err)
>  			return err;
> @@ -3648,6 +3653,8 @@ static int push_jmp_history(struct bpf_verifier_env *env,
>  	p->idx = env->insn_idx;
>  	p->prev_idx = env->prev_insn_idx;
>  	cur->insn_hist_end++;
> +
> +	env->jmp_hist_peak = max(env->jmp_hist_peak, cur->insn_hist_end);
>  	return 0;
>  }
>
> @@ -17205,6 +17212,9 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>  	WARN_ONCE(new->branches != 1,
>  		  "BUG is_state_visited:branches_to_explore=%d insn %d\n", new->branches, insn_idx);
>
> +	env->jmp_hist_total += cur->insn_hist_end - cur->insn_hist_start;
> +	env->jmp_hist_allocs += 1;
> +
>  	cur->parent = new;
>  	cur->first_insn_idx = insn_idx;
>  	cur->insn_hist_start = cur->insn_hist_end;
> @@ -20170,10 +20180,12 @@ static void print_verification_stats(struct bpf_verifier_env *env)
>  		verbose(env, "\n");
>  	}
>  	verbose(env, "processed %d insns (limit %d) max_states_per_insn %d "
> -		"total_states %d peak_states %d mark_read %d\n",
> +		"total_states %d peak_states %d mark_read %d "
> +		"jmp_allocs %d jmp_total %d jmp_peak %d\n",
>  		env->insn_processed, BPF_COMPLEXITY_LIMIT_INSNS,
>  		env->max_states_per_insn, env->total_states,
> -		env->peak_states, env->longest_mark_read_walk);
> +		env->peak_states, env->longest_mark_read_walk,
> +		env->jmp_hist_allocs, env->jmp_hist_total, env->jmp_hist_peak);
>  }
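[Editor's note: a hypothetical sketch of the "single shared history"
scheme discussed in this thread. The names (`record_jump`, `pop_state`,
`shared_hist`) are illustrative, not the kernel's actual API; the point
is only why DFS traversal makes per-state copies unnecessary.]

```python
# Because exploration is depth-first, a child state can reuse the
# parent's prefix of one global history array: pushing a state just
# records where its slice starts, and popping truncates the array back,
# leaving the parent's entries intact.

shared_hist = []  # one global list of (idx, prev_idx) entries

class State:
    def __init__(self, start):
        self.hist_start = start   # this state's slice begins here
        self.hist_end = start

def record_jump(st, idx, prev_idx):
    # Append to the shared history; st owns [hist_start, hist_end).
    # Only the deepest (current) state may append -- the DFS property.
    assert st.hist_end == len(shared_hist), "DFS property required"
    shared_hist.append((idx, prev_idx))
    st.hist_end += 1

def pop_state(st):
    # On backtrack, drop this state's slice; the parent's prefix survives.
    del shared_hist[st.hist_start:]

parent = State(0)
record_jump(parent, 5, 3)
child = State(len(shared_hist))
record_jump(child, 9, 5)
pop_state(child)              # backtrack: child's entries are gone
print(shared_hist)            # [(5, 3)]
```

If traversal ever stopped being DFS, a popped state could no longer
safely truncate the tail, which is why the thread treats the
clone-on-demand "local history" as the escape hatch.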