On Tue, Dec 5, 2023 at 2:01 PM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Tue, Dec 5, 2023 at 10:43 AM Andrii Nakryiko <andrii@xxxxxxxxxx> wrote: > > > > Instead of allocating and copying instruction history each time we > > enqueue child verifier state, switch to a model where we use one common > > dynamically sized array of instruction history entries across all states. > > > > The key observation for proving this is correct is that instruction > > history is only relevant while state is active, which means it either is > > a current state (and thus we are actively modifying instruction history > > and no other state can interfere with us) or we are checkpointed state > > with some children still active (either enqueued or being current). > > > > In the latter case our portion of instruction history is finalized and > > won't change or grow, so as long as we keep it immutable until the state > > is finalized, we are good. > > > > Now, when state is finalized and is put into state hash for potentially > > future pruning lookups, instruction history is not used anymore. This is > > because instruction history is only used by precision marking logic, and > > we never modify precision markings for finalized states. > > > > So, instead of each state having its own small instruction history, we > > keep a global dynamically-sized instruction history, where each state in > > current DFS path from root to active state remembers its portion of > > instruction history. Current state can append to this history, but > > cannot modify any of its parent histories. > > > > Because the insn_hist array can be grown through realloc, states don't > > keep pointers, they instead maintain two indices, [start, end), into > > global instruction history array. End is exclusive index, so > > `start == end` means there is no relevant instruction history. > > > > This eliminates a lot of allocations and minimizes overall memory usage. > > I still think it's too dangerous to rely on DFS here. > The algo would certainly save allocs, but to save memory we can simply do: > > @@ -16128,6 +16128,7 @@ static void clean_verifier_state(struct > bpf_verifier_env *env, > /* all regs in this state in all frames were already marked */ > return; > > + clear_jmp_history(st); > for (i = 0; i <= st->curframe; i++) > clean_func_state(env, st->frame[i]); > } > > to achieve the same effect (it seems to work fine in tests). > I'm not sure how much memory it actually saves > (both above hack and this algo). > We probably need to augment veristat with cgroup memory.peak numbers > the way benchs/bench_htab_mem.c does. yep, would definitely be useful to know overall memory usage for verification > This will give us a clear signal whether we're saving any memory or not. > > Reusing the jmp history array this way also has a negative effect on kasan. > When arrays are separate any UAF or out-of-bounds are easier to catch. > > So I pushed the first 9 patches to bpf-next. thanks. I'll keep the patch around, and if/when we add memory stats to veristat, I'd be curious to measure the difference