Re: [PATCH bpf-next 1/7] bpf: use common jump (instruction) history across all states

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Nov 9, 2023 at 8:14 AM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Thu, Nov 9, 2023 at 7:20 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> >
> > On Mon, 2023-10-30 at 22:03 -0700, Andrii Nakryiko wrote:
> > > Instead of allocating and copying jump history each time we enqueue
> > > child verifier state, switch to a model where we use one common
> > > dynamically sized array of instruction jumps across all states.
> > >
> > > The key observation for proving this is correct is that jmp_history is
> > > only relevant while state is active, which means it either is a current
> > > state (and thus we are actively modifying jump history and no other
> > > state can interfere with us) or we are checkpointed state with some
> > > children still active (either enqueued or being current).
> > >
> > > In the latter case our portion of jump history is finalized and won't
> > > change or grow, so as long as we keep it immutable until the state is
> > > finalized, we are good.
> > >
> > > Now, when state is finalized and is put into state hash for potentially
> > > future pruning lookups, jump history is not used anymore. This is
> > > because jump history is only used by precision marking logic, and we
> > > never modify precision markings for finalized states.
> > >
> > > So, instead of each state having its own small jump history, we keep
> > > a global dynamically-sized jump history, where each state in current DFS
> > > path from root to active state remembers its portion of jump history.
> > > Current state can append to this history, but cannot modify any of its
> > > parent histories.
> > >
> > > Because the jmp_history array can be grown through realloc, states don't
> > > keep pointers, they instead maintain two indexes [start, end) into
> > > global jump history array. End is exclusive index, so start == end means
> > > there is no relevant jump history.
> > >
> > > This should eliminate a lot of allocations and minimize overall memory
> > > usage (but I haven't benchmarked on real hardware, and QEMU benchmarking
> > > is too noisy).
> > >
> > > Also, in the next patch we'll extend jump history to maintain additional
> > > markings for some instructions even if there was no jump, so in
> > > preparation for that call this thing a more generic "instruction history".
> > >
> > > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> >
> > Nitpick: could you please add a comment somewhere in the code
> > (is_state_visited? pop_stack?) saying something like this:
> >
> >   states in the env->head happen to be sorted by insn_hist_end in
> >   descending order, so popping next state for verification poses no
> >   risk of overwriting history relevant for states remaining in
> >   env->head.
> >
> > Side note: this change would make it harder to change states traversal
> > order to something other than DFS, should we chose to do so.
>
> I have the same concern.
>
> When we discussed different algorithms to solve open-coded-iters/bpf_loop
> issue non-DFS ideas came up multiple times.
> To be fair I didn't like them, because I wanted to preserve DFS property :)
> but I feel sooner or later we will be forced to explore non-DFS.
> So I think this patch is no go. There is really no need to rely on DFS here.

If we ever break DFS property, we can easily change this. Or we can
even have a hybrid: as long as traversal preserves DFS property, we
use global shared history, but we can also optionally clone and have
our own history if necessary. It's a matter of adding optional
potentially NULL pointer to "local history". All this is very nicely
hidden away from "normal" code.

> Let instruction history consume more memory. It's a better long term trade off.

Before we decide this, let me collect stats on how much memory we use
for jmp_history with and without my change. I'll need to add a bit of
temporary code to veristat and verifier to collect this, but it
shouldn't take much effort. OK?

> We don't do strict DFS today.
> The speculative execution analysis is DFS, but it visits paths
> multiple times, so it's not a canonical DFS.

Not sure I follow. It's still a DFS, we just branch out more.

But again, let's look at data first. I'll get back with numbers soon.

> It probably doesn't break this particular insn_hist approach,
> but still feels too fragile to rely on DFS assumption long term.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux