Re: [PATCH v4 bpf-next 10/10] bpf: use common instruction history across all states

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Dec 5, 2023 at 2:01 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Tue, Dec 5, 2023 at 10:43 AM Andrii Nakryiko <andrii@xxxxxxxxxx> wrote:
> >
> > Instead of allocating and copying instruction history each time we
> > enqueue child verifier state, switch to a model where we use one common
> > dynamically sized array of instruction history entries across all states.
> >
> > The key observation for proving this is correct is that instruction
> > history is only relevant while state is active, which means it either is
> > a current state (and thus we are actively modifying instruction history
> > and no other state can interfere with us) or we are checkpointed state
> > with some children still active (either enqueued or being current).
> >
> > In the latter case our portion of instruction history is finalized and
> > won't change or grow, so as long as we keep it immutable until the state
> > is finalized, we are good.
> >
> > Now, when state is finalized and is put into state hash for potentially
> > future pruning lookups, instruction history is not used anymore. This is
> > because instruction history is only used by precision marking logic, and
> > we never modify precision markings for finalized states.
> >
> > So, instead of each state having its own small instruction history, we
> > keep a global dynamically-sized instruction history, where each state in
> > current DFS path from root to active state remembers its portion of
> > instruction history.  Current state can append to this history, but
> > cannot modify any of its parent histories.
> >
> > Because the insn_hist array can be grown through realloc, states don't
> > keep pointers, they instead maintain two indices, [start, end), into
> > global instruction history array. End is exclusive index, so
> > `start == end` means there is no relevant instruction history.
> >
> > This eliminates a lot of allocations and minimizes overall memory usage.
>
> I still think it's too dangerous to rely on DFS here.
> The algo would certainly save allocs, but to save memory we can simply do:
>
> @@ -16128,6 +16128,7 @@ static void clean_verifier_state(struct
> bpf_verifier_env *env,
>                 /* all regs in this state in all frames were already marked */
>                 return;
>
> +       clear_jmp_history(st);
>         for (i = 0; i <= st->curframe; i++)
>                 clean_func_state(env, st->frame[i]);
>  }
>
> to achieve the same effect (it seems to work fine in tests).
> I'm not sure how much memory it actually saves
> (both above hack and this algo).
> We probably need to augment veristat with cgroup memory.peak numbers
> the way benchs/bench_htab_mem.c does.

yep, would definitely be useful to know overall memory usage for verification

> This will give us a clear signal whether we're saving any memory or not.
>
> Reusing the jmp history array this way also has a negative effect on kasan.
> When arrays are separate any UAF or out-of-bounds are easier to catch.
>
> So I pushed the first 9 patches to bpf-next.

thanks. I'll keep the patch around, and if/when we add memory stats to
veristat, I'd be curious to measure the difference





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux