Re: [PATCH bpf-next 03/10] bpf: encapsulate precision backtracking bookkeeping

On Wed, May 3, 2023 at 7:53 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> A few early comments so far...
>
> On Tue, Apr 25, 2023 at 04:49:04PM -0700, Andrii Nakryiko wrote:
> > Add struct backtrack_state and straightforward API around it to keep
> > track of register and stack masks used and maintained during precision
> > backtracking process. Having this logic separate allows keeping the
> > high-level backtracking algorithm cleaner, and also sets us up to
> > cleanly keep track of register and stack masks per frame, allowing us
> > (with some further logic adjustments) to perform precision
> > backpropagation across multiple frames (i.e., subprog calls).
> >
> > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > ---
> >  include/linux/bpf_verifier.h |  15 ++
> >  kernel/bpf/verifier.c        | 258 ++++++++++++++++++++++++++---------
> >  2 files changed, 206 insertions(+), 67 deletions(-)
> >
> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index 3dd29a53b711..185bfaf0ec6b 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
> > @@ -238,6 +238,10 @@ enum bpf_stack_slot_type {
> >
> >  #define BPF_REG_SIZE 8       /* size of eBPF register in bytes */
> >
> > +#define BPF_REGMASK_ARGS ((1 << BPF_REG_1) | (1 << BPF_REG_2) | \
> > +                       (1 << BPF_REG_3) | (1 << BPF_REG_4) | \
> > +                       (1 << BPF_REG_5))
> > +
> >  #define BPF_DYNPTR_SIZE              sizeof(struct bpf_dynptr_kern)
> >  #define BPF_DYNPTR_NR_SLOTS          (BPF_DYNPTR_SIZE / BPF_REG_SIZE)
> >
> > @@ -541,6 +545,16 @@ struct bpf_subprog_info {
> >       bool is_async_cb;
> >  };
> >
> > +struct bpf_verifier_env;
> > +
> > +struct backtrack_state {
> > +     struct bpf_verifier_env *env;
> > +     u32 frame;
> > +     u32 bitcnt;
> > +     u32 reg_masks[MAX_CALL_FRAMES];
> > +     u64 stack_masks[MAX_CALL_FRAMES];
> > +};
> > +
> >  /* single container for all structs
> >   * one verifier_env per bpf_check() call
> >   */
> > @@ -578,6 +592,7 @@ struct bpf_verifier_env {
> >               int *insn_stack;
> >               int cur_stack;
> >       } cfg;
> > +     struct backtrack_state bt;
> >       u32 pass_cnt; /* number of times do_check() was called */
> >       u32 subprog_cnt;
> >       /* number of instructions analyzed by the verifier */
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index fea6fe4acba2..1cb89fe00507 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -1254,6 +1254,12 @@ static bool is_spilled_reg(const struct bpf_stack_state *stack)
> >       return stack->slot_type[BPF_REG_SIZE - 1] == STACK_SPILL;
> >  }
> >
> > +static bool is_spilled_scalar_reg(const struct bpf_stack_state *stack)
> > +{
> > +     return stack->slot_type[BPF_REG_SIZE - 1] == STACK_SPILL &&
> > +            stack->spilled_ptr.type == SCALAR_VALUE;
> > +}
> > +
> >  static void scrub_spilled_slot(u8 *stype)
> >  {
> >       if (*stype != STACK_INVALID)
> > @@ -3144,12 +3150,137 @@ static const char *disasm_kfunc_name(void *data, const struct bpf_insn *insn)
> >       return btf_name_by_offset(desc_btf, func->name_off);
> >  }
> >
> > +static inline void bt_init(struct backtrack_state *bt, u32 frame)
> > +{
> > +     bt->frame = frame;
> > +}
> > +
> > +static inline void bt_reset(struct backtrack_state *bt)
> > +{
> > +     struct bpf_verifier_env *env = bt->env;
> > +     memset(bt, 0, sizeof(*bt));
> > +     bt->env = env;
> > +}
> > +
> > +static inline u32 bt_bitcnt(struct backtrack_state *bt)
> > +{
> > +     return bt->bitcnt;
> > +}
>
> I could have missed it, but it doesn't look like any further patch uses
> the actual number of bits set.
> All uses are: if (bt_bitcnt(bt) != 0)
>
> Hence keeping bitcnt as an extra 4 bytes and doing ++, -- on it
> seems wasteful.
> Maybe rename bt_bitcnt to bt_empty or bt_non_empty, which would
> do !!bt->reg_masks[bt->frame] | !!bt->stack_masks[bt->frame]

Yes, the number of bits doesn't matter, only whether any are set. So
I'll rename.

As for the counter: I added it to avoid iterating over all
MAX_CALL_FRAMES frames to calculate the final mask. So it was a choice
between maintaining the count proactively and doing a loop each time
we need to know whether any bits are set:
u64 mask = 0;

for (i = 0; i <= bt->frame; i++)
    mask |= bt->reg_masks[i] | bt->stack_masks[i];

return mask != 0;


I don't think this loop is very expensive either, so I can just switch
to it if you prefer. That would eliminate all the ifs, of course.
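
For concreteness, with the loop approach the check collapses into a
small helper along these lines (a sketch; bt_empty is a placeholder
name, it could equally be bt_non_empty with the condition inverted):

static inline bool bt_empty(struct backtrack_state *bt)
{
	u64 mask = 0;
	int i;

	/* OR together reg and stack masks across all active frames */
	for (i = 0; i <= bt->frame; i++)
		mask |= bt->reg_masks[i] | bt->stack_masks[i];

	return mask == 0;
}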


>
>
> > +static inline int bt_subprog_enter(struct backtrack_state *bt)
> > +{
> > +     if (bt->frame == MAX_CALL_FRAMES - 1) {
> > +             verbose(bt->env, "BUG subprog enter from frame %d\n", bt->frame);
> > +             WARN_ONCE(1, "verifier backtracking bug");
> > +             return -EFAULT;
> > +     }
> > +     bt->frame++;
> > +     return 0;
> > +}
> > +
> > +static inline int bt_subprog_exit(struct backtrack_state *bt)
> > +{
> > +     if (bt->frame == 0) {
> > +             verbose(bt->env, "BUG subprog exit from frame 0\n");
> > +             WARN_ONCE(1, "verifier backtracking bug");
> > +             return -EFAULT;
> > +     }
> > +     bt->frame--;
> > +     return 0;
> > +}
> > +
> > +static inline void bt_set_frame_reg(struct backtrack_state *bt, u32 frame, u32 reg)
> > +{
> > +     if (bt->reg_masks[frame] & (1 << reg))
> > +             return;
> > +
> > +     bt->reg_masks[frame] |= 1 << reg;
> > +     bt->bitcnt++;
> > +}
>
> It doesn't look like any further patch is using bt_set_frame_reg with an explicit frame.
> If not, collapse bt_set_frame_reg and bt_set_reg?

We do use it later on, in patch #8, e.g., to propagate r1-r5 bits from
a subprog back to its caller, roughly as sketched below.
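
A sketch of the idea (not the exact patch #8 code), using only the
helpers quoted above: when backtracking walks backwards through a
subprog call instruction, precision requests on the callee's r1-r5
(its input arguments) are transferred to the caller's frame:

/* r1-r5 requested as precise in the callee frame are the
 * caller's argument registers, so move those bits one frame
 * down before bt->frame is decremented via bt_subprog_exit()
 */
for (i = BPF_REG_1; i <= BPF_REG_5; i++) {
	if (bt->reg_masks[bt->frame] & (1 << i)) {
		bt_clear_frame_reg(bt, bt->frame, i);
		bt_set_frame_reg(bt, bt->frame - 1, i);
	}
}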

>
> > +
> > +static inline void bt_clear_frame_reg(struct backtrack_state *bt, u32 frame, u32 reg)
> > +{
> > +     if (!(bt->reg_masks[frame] & (1 << reg)))
> > +             return;
> > +
> > +     bt->reg_masks[frame] &= ~(1 << reg);
> > +     bt->bitcnt--;
> > +}
>
> If we remove ++,-- of bitcnt this function will be much shorter and faster:
> +static inline void bt_clear_frame_reg(struct backtrack_state *bt, u32 frame, u32 reg)
> +{
> +       bt->reg_masks[frame] &= ~(1 << reg);
> +}
>
> Removing the runtime conditional has a nice perf benefit. Obviously tiny in the grand scheme of things, but still.
>

Yep, see above. I did it to avoid looping over up to 8 frame levels to
OR all the reg_masks and stack_masks. Maybe that was the wrong
tradeoff, though, and it's better to do one small loop when we need to
check whether there are any bits left to clear?

> Overall it's a nice cleanup.

Thanks!


Will fix the missing empty line you mentioned in another reply.



