On Wed, May 3, 2023 at 7:53 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> Few early comments so far...
>
> On Tue, Apr 25, 2023 at 04:49:04PM -0700, Andrii Nakryiko wrote:
> > Add struct backtrack_state and a straightforward API around it to
> > keep track of register and stack masks used and maintained during
> > the precision backtracking process. Keeping this logic separate
> > makes the high-level backtracking algorithm cleaner, and also sets
> > us up to cleanly track register and stack masks per frame, allowing
> > (with some further logic adjustments) precision backpropagation
> > across multiple frames (i.e., subprog calls).
> >
> > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > ---
> >  include/linux/bpf_verifier.h |  15 ++
> >  kernel/bpf/verifier.c        | 258 ++++++++++++++++++++++++++---------
> >  2 files changed, 206 insertions(+), 67 deletions(-)
> >
> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index 3dd29a53b711..185bfaf0ec6b 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
> > @@ -238,6 +238,10 @@ enum bpf_stack_slot_type {
> >
> >  #define BPF_REG_SIZE 8	/* size of eBPF register in bytes */
> >
> > +#define BPF_REGMASK_ARGS ((1 << BPF_REG_1) | (1 << BPF_REG_2) | \
> > +			  (1 << BPF_REG_3) | (1 << BPF_REG_4) | \
> > +			  (1 << BPF_REG_5))
> > +
> >  #define BPF_DYNPTR_SIZE sizeof(struct bpf_dynptr_kern)
> >  #define BPF_DYNPTR_NR_SLOTS (BPF_DYNPTR_SIZE / BPF_REG_SIZE)
> >
> > @@ -541,6 +545,16 @@ struct bpf_subprog_info {
> >  	bool is_async_cb;
> >  };
> >
> > +struct bpf_verifier_env;
> > +
> > +struct backtrack_state {
> > +	struct bpf_verifier_env *env;
> > +	u32 frame;
> > +	u32 bitcnt;
> > +	u32 reg_masks[MAX_CALL_FRAMES];
> > +	u64 stack_masks[MAX_CALL_FRAMES];
> > +};
> > +
> >  /* single container for all structs
> >   * one verifier_env per bpf_check() call
> >   */
> > @@ -578,6 +592,7 @@ struct bpf_verifier_env {
> >  		int *insn_stack;
> >  		int cur_stack;
> >  	} cfg;
> > +	struct backtrack_state bt;
> >  	u32 pass_cnt; /* number of times do_check() was called */
> >  	u32 subprog_cnt;
> >  	/* number of instructions analyzed by the verifier */
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index fea6fe4acba2..1cb89fe00507 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -1254,6 +1254,12 @@ static bool is_spilled_reg(const struct bpf_stack_state *stack)
> >  	return stack->slot_type[BPF_REG_SIZE - 1] == STACK_SPILL;
> >  }
> >
> > +static bool is_spilled_scalar_reg(const struct bpf_stack_state *stack)
> > +{
> > +	return stack->slot_type[BPF_REG_SIZE - 1] == STACK_SPILL &&
> > +	       stack->spilled_ptr.type == SCALAR_VALUE;
> > +}
> > +
> >  static void scrub_spilled_slot(u8 *stype)
> >  {
> >  	if (*stype != STACK_INVALID)
> > @@ -3144,12 +3150,137 @@ static const char *disasm_kfunc_name(void *data, const struct bpf_insn *insn)
> >  	return btf_name_by_offset(desc_btf, func->name_off);
> >  }
> >
> > +static inline void bt_init(struct backtrack_state *bt, u32 frame)
> > +{
> > +	bt->frame = frame;
> > +}
> > +
> > +static inline void bt_reset(struct backtrack_state *bt)
> > +{
> > +	struct bpf_verifier_env *env = bt->env;
> > +	memset(bt, 0, sizeof(*bt));
> > +	bt->env = env;
> > +}
> > +
> > +static inline u32 bt_bitcnt(struct backtrack_state *bt)
> > +{
> > +	return bt->bitcnt;
> > +}
>
> I could have missed it, but it doesn't look like any further patch uses
> the actual number of bits set.
> All uses are: if (bt_bitcnt(bt) != 0)
>
> Hence keeping bitcnt as extra 4 bytes and doing ++, -- on it
> seems wasteful.
> Maybe rename bt_bitcnt into bt_empty or bt_non_empty that
> will do !!bt->reg_masks[bt->frame] | !!bt->stack_masks[bt->frame]

Yes, the number of bits doesn't matter, only whether any are set, so
I'll rename.

As for the counter: I maintained it to avoid going over all
MAX_CALL_FRAMES frames to calculate the final mask.
So it was a choice between maintaining the count proactively, or doing
a loop each time we need to know if there are any bits set:

u64 mask = 0;

for (i = 0; i <= bt->frame; i++)
	mask |= bt->reg_masks[i] | bt->stack_masks[i];

return mask != 0;

I don't think this one is very expensive either, so I can just switch
to this if you prefer. That will eliminate all the ifs, of course.

>
> > +static inline int bt_subprog_enter(struct backtrack_state *bt)
> > +{
> > +	if (bt->frame == MAX_CALL_FRAMES - 1) {
> > +		verbose(bt->env, "BUG subprog enter from frame %d\n", bt->frame);
> > +		WARN_ONCE(1, "verifier backtracking bug");
> > +		return -EFAULT;
> > +	}
> > +	bt->frame++;
> > +	return 0;
> > +}
> > +
> > +static inline int bt_subprog_exit(struct backtrack_state *bt)
> > +{
> > +	if (bt->frame == 0) {
> > +		verbose(bt->env, "BUG subprog exit from frame 0\n");
> > +		WARN_ONCE(1, "verifier backtracking bug");
> > +		return -EFAULT;
> > +	}
> > +	bt->frame--;
> > +	return 0;
> > +}
> > +
> > +static inline void bt_set_frame_reg(struct backtrack_state *bt, u32 frame, u32 reg)
> > +{
> > +	if (bt->reg_masks[frame] & (1 << reg))
> > +		return;
> > +
> > +	bt->reg_masks[frame] |= 1 << reg;
> > +	bt->bitcnt++;
> > +}
>
> It doesn't look like any further patch is using bt_set_frame_reg with
> an explicit frame. If not, collapse bt_set_frame_reg and bt_set_reg?

We do later on, in patch #8, e.g., to propagate r1-r5 bits from a
subprog back to the caller.

>
> > +
> > +static inline void bt_clear_frame_reg(struct backtrack_state *bt, u32 frame, u32 reg)
> > +{
> > +	if (!(bt->reg_masks[frame] & (1 << reg)))
> > +		return;
> > +
> > +	bt->reg_masks[frame] &= ~(1 << reg);
> > +	bt->bitcnt--;
> > +}
>
> If we remove ++,-- of bitcnt this function will be much shorter and faster:
>
> +static inline void bt_clear_frame_reg(struct backtrack_state *bt, u32 frame, u32 reg)
> +{
> +	bt->reg_masks[frame] &= ~(1 << reg);
> +}
>
> Removing runtime conditional has a nice perf benefit.
> Obviously tiny in the grand scheme, but still.

Yep, see above. I did it to avoid doing a loop over up to 8 frame
levels to OR all reg_masks and stack_masks. Maybe a wrong tradeoff,
and it's best to do one small loop when we need to check if there are
any bits left to clear?

> Overall it's a nice cleanup.

Thanks! Will fix the missing empty line you mentioned in another reply.
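To make the tradeoff above concrete, here is a standalone userspace sketch of the counter-free variant under discussion. This is my illustration, not the kernel code: the struct is simplified (no env pointer), fixed-width stdint types stand in for the kernel's u32/u64, and bt_empty() is the hypothetical loop-based emptiness check replacing bt_bitcnt():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_CALL_FRAMES 8	/* mirrors the kernel constant */

/* Simplified stand-in for the patch's struct backtrack_state:
 * per-frame masks only, no bitcnt counter and no env pointer.
 */
struct backtrack_state {
	uint32_t frame;
	uint32_t reg_masks[MAX_CALL_FRAMES];
	uint64_t stack_masks[MAX_CALL_FRAMES];
};

/* Branchless set/clear, as suggested in the review: with no bitcnt
 * to maintain, there is no conditional on the hot path.
 */
static inline void bt_set_frame_reg(struct backtrack_state *bt,
				    uint32_t frame, uint32_t reg)
{
	bt->reg_masks[frame] |= 1u << reg;
}

static inline void bt_clear_frame_reg(struct backtrack_state *bt,
				      uint32_t frame, uint32_t reg)
{
	bt->reg_masks[frame] &= ~(1u << reg);
}

/* Emptiness check done lazily: OR together all per-frame register and
 * stack masks up to the current frame. This is the small loop traded
 * against maintaining the counter proactively.
 */
static inline bool bt_empty(const struct backtrack_state *bt)
{
	uint64_t mask = 0;
	uint32_t i;

	for (i = 0; i <= bt->frame; i++)
		mask |= bt->reg_masks[i] | bt->stack_masks[i];
	return mask == 0;
}
```

Usage follows the pattern in the thread: set a bit when a register needs to become precise, clear it once backtracking resolves it, and ask bt_empty() to decide whether backtracking can stop. Since the loop runs at most MAX_CALL_FRAMES (8) iterations of pure ORs, it is cheap relative to the branches it removes from every set/clear call.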