On Thu, May 04, 2023 at 12:02:22PM -0700, Andrii Nakryiko wrote:
> On Wed, May 3, 2023 at 7:53 PM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > Few early comments so far...
> >
> > On Tue, Apr 25, 2023 at 04:49:04PM -0700, Andrii Nakryiko wrote:
> > > Add struct backtrack_state and straightforward API around it to keep
> > > track of register and stack masks used and maintained during precision
> > > backtracking process. Having this logic separate allows keeping the
> > > high-level backtracking algorithm cleaner, but it also sets us up to
> > > cleanly keep track of register and stack masks per frame, allowing (with
> > > some further logic adjustments) to perform precision backpropagation
> > > across multiple frames (i.e., subprog calls).
> > >
> > > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > > ---
> > >  include/linux/bpf_verifier.h |  15 ++
> > >  kernel/bpf/verifier.c        | 258 ++++++++++++++++++++++++++---------
> > >  2 files changed, 206 insertions(+), 67 deletions(-)
> > >
> > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > > index 3dd29a53b711..185bfaf0ec6b 100644
> > > --- a/include/linux/bpf_verifier.h
> > > +++ b/include/linux/bpf_verifier.h
> > > @@ -238,6 +238,10 @@ enum bpf_stack_slot_type {
> > >
> > >  #define BPF_REG_SIZE 8	/* size of eBPF register in bytes */
> > >
> > > +#define BPF_REGMASK_ARGS ((1 << BPF_REG_1) | (1 << BPF_REG_2) | \
> > > +			  (1 << BPF_REG_3) | (1 << BPF_REG_4) | \
> > > +			  (1 << BPF_REG_5))
> > > +
> > >  #define BPF_DYNPTR_SIZE		sizeof(struct bpf_dynptr_kern)
> > >  #define BPF_DYNPTR_NR_SLOTS		(BPF_DYNPTR_SIZE / BPF_REG_SIZE)
> > >
> > > @@ -541,6 +545,16 @@ struct bpf_subprog_info {
> > >  	bool is_async_cb;
> > >  };
> > >
> > > +struct bpf_verifier_env;
> > > +
> > > +struct backtrack_state {
> > > +	struct bpf_verifier_env *env;
> > > +	u32 frame;
> > > +	u32 bitcnt;
> > > +	u32 reg_masks[MAX_CALL_FRAMES];
> > > +	u64 stack_masks[MAX_CALL_FRAMES];
> > > +};
> > > +
> > >  /* single container for all structs
> > >   * one verifier_env per bpf_check() call
> > >   */
> > > @@ -578,6 +592,7 @@ struct bpf_verifier_env {
> > >  		int *insn_stack;
> > >  		int cur_stack;
> > >  	} cfg;
> > > +	struct backtrack_state bt;
> > >  	u32 pass_cnt; /* number of times do_check() was called */
> > >  	u32 subprog_cnt;
> > >  	/* number of instructions analyzed by the verifier */
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index fea6fe4acba2..1cb89fe00507 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -1254,6 +1254,12 @@ static bool is_spilled_reg(const struct bpf_stack_state *stack)
> > >  	return stack->slot_type[BPF_REG_SIZE - 1] == STACK_SPILL;
> > >  }
> > >
> > > +static bool is_spilled_scalar_reg(const struct bpf_stack_state *stack)
> > > +{
> > > +	return stack->slot_type[BPF_REG_SIZE - 1] == STACK_SPILL &&
> > > +	       stack->spilled_ptr.type == SCALAR_VALUE;
> > > +}
> > > +
> > >  static void scrub_spilled_slot(u8 *stype)
> > >  {
> > >  	if (*stype != STACK_INVALID)
> > > @@ -3144,12 +3150,137 @@ static const char *disasm_kfunc_name(void *data, const struct bpf_insn *insn)
> > >  	return btf_name_by_offset(desc_btf, func->name_off);
> > >  }
> > >
> > > +static inline void bt_init(struct backtrack_state *bt, u32 frame)
> > > +{
> > > +	bt->frame = frame;
> > > +}
> > > +
> > > +static inline void bt_reset(struct backtrack_state *bt)
> > > +{
> > > +	struct bpf_verifier_env *env = bt->env;
> > > +	memset(bt, 0, sizeof(*bt));
> > > +	bt->env = env;
> > > +}
> > > +
> > > +static inline u32 bt_bitcnt(struct backtrack_state *bt)
> > > +{
> > > +	return bt->bitcnt;
> > > +}
> >
> > I could have missed it, but it doesn't look like any further patch uses
> > the actual number of bits set.
> > All uses are: if (bt_bitcnt(bt) != 0)
> >
> > Hence keeping bitcnt as extra 4 bytes and doing ++, -- on it
> > seems wasteful.
> > Maybe rename bt_bitcnt into bt_empty or bt_non_empty that
> > will do !!bt->reg_masks[bt->frame] | !!bt->stack_masks[bt->frame]
>
> Yes, the number of bits doesn't matter, it's whether there are any or not.
> So I'll rename.
>
> As for the counter: I did it to avoid going over all MAX_CALL_FRAMES
> frames to calculate the final mask. So it was a choice of maintaining
> the count proactively, or doing a loop each time we need to know if there
> are any bits set:
>
> u64 mask = 0;
>
> for (i = 0; i <= bt->frame; i++)
>         mask |= bt->reg_masks[i] | bt->stack_masks[i];
>
> return mask != 0;
>
> I don't think this one is very expensive either, so I can just switch
> to this, if you prefer that. That will eliminate all the ifs, of
> course.

I see. Missed the loop over frames part.
I guess I'm fine whichever way. Interesting trade-off.
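The trade-off under discussion (maintaining bitcnt eagerly vs. recomputing emptiness on demand) can be sketched in userspace as follows. This is a simplified, hedged illustration, not the kernel code: the struct drops the env pointer and bitcnt, uses stdint types instead of the kernel's u32/u64, and the helper name bt_empty is taken from Alexei's renaming suggestion. MAX_CALL_FRAMES is 8 in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_CALL_FRAMES 8 /* kernel's subprog call depth limit */

/* Simplified stand-in for the patch's struct backtrack_state,
 * with the bitcnt counter dropped as discussed in the thread. */
struct backtrack_state {
	uint32_t frame;                        /* index of innermost frame */
	uint32_t reg_masks[MAX_CALL_FRAMES];   /* per-frame register masks */
	uint64_t stack_masks[MAX_CALL_FRAMES]; /* per-frame stack slot masks */
};

/* Loop-based emptiness check: OR together every frame's register and
 * stack masks instead of maintaining a running bit count. */
static bool bt_empty(const struct backtrack_state *bt)
{
	uint64_t mask = 0;
	uint32_t i;

	for (i = 0; i <= bt->frame; i++)
		mask |= bt->reg_masks[i] | bt->stack_masks[i];
	return mask == 0;
}
```

The loop is bounded by the (small, constant) frame depth, which is why both sides of the thread consider it cheap enough; the win is that there is no counter to keep in sync on every set/clear.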