Re: [PATCH bpf-next 1/7] bpf: use common jump (instruction) history across all states

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Thu, 9 Nov 2023 08:13:45 -0800

On Thu, Nov 9, 2023 at 7:20 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Mon, 2023-10-30 at 22:03 -0700, Andrii Nakryiko wrote:
> > Instead of allocating and copying jump history each time we enqueue
> > child verifier state, switch to a model where we use one common
> > dynamically sized array of instruction jumps across all states.
> >
> > The key observation for proving this is correct is that jmp_history is
> > only relevant while state is active, which means it either is a current
> > state (and thus we are actively modifying jump history and no other
> > state can interfere with us) or we are checkpointed state with some
> > children still active (either enqueued or being current).
> >
> > In the latter case our portion of jump history is finalized and won't
> > change or grow, so as long as we keep it immutable until the state is
> > finalized, we are good.
> >
> > Now, when state is finalized and is put into state hash for potentially
> > future pruning lookups, jump history is not used anymore. This is
> > because jump history is only used by precision marking logic, and we
> > never modify precision markings for finalized states.
> >
> > So, instead of each state having its own small jump history, we keep
> > a global dynamically-sized jump history, where each state in current DFS
> > path from root to active state remembers its portion of jump history.
> > Current state can append to this history, but cannot modify any of its
> > parent histories.
> >
> > Because the jmp_history array can be grown through realloc, states don't
> > keep pointers, they instead maintain two indexes [start, end) into
> > global jump history array. End is exclusive index, so start == end means
> > there is no relevant jump history.
> >
> > This should eliminate a lot of allocations and minimize overall memory
> > usage (but I haven't benchmarked on real hardware, and QEMU benchmarking
> > is too noisy).
> >
> > Also, in the next patch we'll extend jump history to maintain additional
> > markings for some instructions even if there was no jump, so in
> > preparation for that call this thing a more generic "instruction history".
> >
> > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
>
> Nitpick: could you please add a comment somewhere in the code
> (is_state_visited? pop_stack?) saying something like this:
>
>   states in the env->head happen to be sorted by insn_hist_end in
>   descending order, so popping next state for verification poses no
>   risk of overwriting history relevant for states remaining in
>   env->head.
>
> Side note: this change would make it harder to change states traversal
> order to something other than DFS, should we chose to do so.

I have the same concern.

When we discussed different algorithms to solve open-coded-iters/bpf_loop
issue non-DFS ideas came up multiple times.
To be fair I didn't like them, because I wanted to preserve DFS property :)
but I feel sooner or later we will be forced to explore non-DFS.
So I think this patch is no go. There is really no need to rely on DFS here.
Let instruction history consume more memory. It's a better long term trade off.
We don't do strict DFS today.
The speculative execution analysis is DFS, but it visits paths
multiple times, so it's not a canonical DFS.
It probably doesn't break this particular insn_hist approach,
but still feels too fragile to rely on DFS assumption long term.