On Tue, 2024-07-09 at 17:34 -0700, Andrii Nakryiko wrote:
> On Fri, Jul 5, 2024 at 1:59 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> >
> > Use bpf_verifier_state->jmp_history to track which registers were
> > updated by find_equal_scalars() when conditional jump was verified.
> > Use recorded information in backtrack_insn() to propagate precision.
> >
> > E.g. for the following program:
> >
> >             while verifying instructions
> >   r1 = r0              |
> >   if r1 < 8  goto ...  | push r0,r1 as equal_scalars in jmp_history
> >   if r0 > 16 goto ...  | push r0,r1 as equal_scalars in jmp_history
>
> linked_scalars? especially now that Alexei added offsets between
> linked registers

Missed this, will update.

> >   r2 = r10             |
> >   r2 += r0             v mark_chain_precision(r0)
> >
> >             while doing mark_chain_precision(r0)
> >   r1 = r0              ^
> >   if r1 < 8  goto ...  | mark r0,r1 as precise
> >   if r0 > 16 goto ...  | mark r0,r1 as precise
> >   r2 = r10             |
> >   r2 += r0             | mark r0 precise
>
> let's reverse the order here so it's linear in how the algorithm
> actually works (backwards)?

I thought the arrow would be enough.
Ok, can reverse.

> > Technically, achieve this as follows:
> > - Use 10 bits to identify each register that gains range because of
> >   find_equal_scalars():
>
> should this be renamed to find_linked_scalars() nowadays?

That would be sync_linked_regs(), if we use the naming that you suggest.
Will update.

[...]

> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index e25ad5fb9115..ec493360607e 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -3335,9 +3335,87 @@ static bool is_jmp_point(struct bpf_verifier_env *env, int insn_idx)
> >         return env->insn_aux_data[insn_idx].jmp_point;
> >  }
> >
> > +#define ES_FRAMENO_BITS 3
> > +#define ES_SPI_BITS     6
> > +#define ES_ENTRY_BITS   (ES_SPI_BITS + ES_FRAMENO_BITS + 1)
> > +#define ES_SIZE_BITS    4
> > +#define ES_FRAMENO_MASK ((1ul << ES_FRAMENO_BITS) - 1)
> > +#define ES_SPI_MASK     ((1ul << ES_SPI_BITS) - 1)
> > +#define ES_SIZE_MASK    ((1ul << ES_SIZE_BITS) - 1)
>
> ull for 32-bit arches?

Ok

> > +#define ES_SPI_OFF      ES_FRAMENO_BITS
> > +#define ES_IS_REG_OFF   (ES_SPI_BITS + ES_FRAMENO_BITS)
>
> ES makes no sense now, no? LR or LINKREG or something along those lines?
>
> > +#define LINKED_REGS_MAX 6
> > +
> > +struct reg_or_spill {
>
> reg_or_spill -> linked_reg ?

Ok

> > +       u8 frameno:3;
> > +       union {
> > +               u8 spi:6;
> > +               u8 regno:6;
> > +       };
> > +       bool is_reg:1;
> > +};
>
> Do we need these bitfields for the unpacked representation? It's going
> to use 2 bytes for this struct anyway. If you just use u8 for
> everything you end up with 3 bytes. Bitfields are a bit slower because
> the compiler will need to do more bit manipulations, so is it really
> worth it?

Ok, will remove bitfields.
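
To make the packed layout easier to review, here is a sketch of how the
pack/unpack pair would look with all of the above renames applied
(LR_* prefix, linked_reg, no bitfields, 1ull). This illustrates the
encoding only; the next revision may differ in details:

  #define LR_FRAMENO_BITS 3
  #define LR_SPI_BITS     6
  #define LR_ENTRY_BITS   (LR_SPI_BITS + LR_FRAMENO_BITS + 1)
  #define LR_SIZE_BITS    4
  #define LR_FRAMENO_MASK ((1ull << LR_FRAMENO_BITS) - 1)
  #define LR_SPI_MASK     ((1ull << LR_SPI_BITS)     - 1)
  #define LR_SIZE_MASK    ((1ull << LR_SIZE_BITS)    - 1)
  #define LR_SPI_OFF      LR_FRAMENO_BITS
  #define LR_IS_REG_OFF   (LR_SPI_BITS + LR_FRAMENO_BITS)
  #define LINKED_REGS_MAX 6

  struct linked_reg {
          u8 frameno;
          union {
                  u8 spi;
                  u8 regno;
          };
          bool is_reg;
  };

  struct linked_regs {
          int cnt;
          struct linked_reg entries[LINKED_REGS_MAX];
  };

  /* Each entry takes LR_ENTRY_BITS = 10 bits: 3-bit frameno, 6-bit
   * spi-or-regno, 1-bit is_reg flag; the low LR_SIZE_BITS = 4 bits
   * hold the entry count.
   */
  static u64 linked_regs_pack(struct linked_regs *s)
  {
          u64 val = 0;
          int i;

          for (i = 0; i < s->cnt; i++) {
                  struct linked_reg *e = &s->entries[i];
                  u64 tmp = 0;

                  tmp |= e->frameno;
                  tmp |= e->spi << LR_SPI_OFF;
                  tmp |= (e->is_reg ? 1 : 0) << LR_IS_REG_OFF;

                  val <<= LR_ENTRY_BITS;
                  val |= tmp;
          }
          val <<= LR_SIZE_BITS;
          val |= s->cnt;
          return val;
  }

  /* Entries come out in reverse order of packing, which is fine:
   * the set of linked registers is unordered.
   */
  static void linked_regs_unpack(u64 val, struct linked_regs *s)
  {
          int i;

          s->cnt = val & LR_SIZE_MASK;
          val >>= LR_SIZE_BITS;

          for (i = 0; i < s->cnt; i++) {
                  struct linked_reg *e = &s->entries[i];

                  e->frameno = val & LR_FRAMENO_MASK;
                  e->spi = (val >> LR_SPI_OFF) & LR_SPI_MASK;
                  e->is_reg = (val >> LR_IS_REG_OFF) & 0x1;
                  val >>= LR_ENTRY_BITS;
          }
  }

With LINKED_REGS_MAX = 6 the encoding uses the full u64
(4 + 6*10 = 64 bits), so growing the set would require widening the
jmp_history entry.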
[...]

> > @@ -3844,6 +3974,7 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
> >                          */
> >                         bt_set_reg(bt, dreg);
> >                         bt_set_reg(bt, sreg);
> > +               } else if (BPF_SRC(insn->code) == BPF_K) {
> >                         /* else dreg <cond> K
>
> drop "else" from the comment then? I like this change.

This is actually a leftover from v1.
I can drop "else" from the comment, or drop this hunk altogether as it
is not necessary for the series.

> >                          * Only dreg still needs precision before
> >                          * this insn, so for the K-based conditional
> > @@ -3862,6 +3993,10 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
> >                 /* to be analyzed */
> >                 return -ENOTSUPP;
> >         }
> > +       /* Propagate precision marks to linked registers, to account for
> > +        * registers marked as precise in this function.
> > +        */
> > +       bt_sync_linked_regs(bt, hist);
>
> Radical Andrii is fine with this, though I wonder if there is some
> place outside of backtrack_insn() where the first
> bt_sync_linked_regs() could be called just once?

The problem here is twofold:
- in theory, linked_regs could be present for any instruction, thus
  sync() is needed after the get_jmp_hist_entry() call in
  __mark_chain_precision();
- backtrack_insn() might both remove and add some registers in bt,
  hence, to correctly handle the bt_empty() check in
  __mark_chain_precision(), sync() is also needed after
  backtrack_insn().

So the current placement is the simplest I could come up with.

> But regardless, this is only mildly expensive when we do have linked
> registers, so unlikely to have any noticeable performance effect.

Yes, that was my thinking as well.
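
And the sync operation itself boils down to a scan of at most
LINKED_REGS_MAX entries. A minimal user-space sketch of the idea
(simplified to frame-0 registers tracked as a plain bitmask, stack
slots omitted; illustrative, not the patch itself):

  #include <stdbool.h>
  #include <stdint.h>

  struct linked_reg  { uint8_t regno; };
  struct linked_regs { int cnt; struct linked_reg entries[6]; };

  /* If any register of the linked set currently needs precision (its
   * bit is set in *bt_reg_mask), request precision for the whole set:
   * all of these registers gained their range from the same conditional
   * jump, so the range of one is justified only if all are precise.
   */
  static void bt_sync_linked_regs(uint32_t *bt_reg_mask,
                                  const struct linked_regs *regs)
  {
          bool some_precise = false;
          int i;

          for (i = 0; i < regs->cnt; i++)
                  if (*bt_reg_mask & (1u << regs->entries[i].regno))
                          some_precise = true;

          if (!some_precise)
                  return;

          for (i = 0; i < regs->cnt; i++)
                  *bt_reg_mask |= 1u << regs->entries[i].regno;
  }

Because backtrack_insn() may set or clear the very bits that decide
whether this propagation triggers, the call has to come after it and
before the bt_empty() check, which is exactly the placement above.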
[...]

> > @@ -15154,14 +15289,66 @@ static bool try_match_pkt_pointers(const struct bpf_insn *insn,
> >                 return true;
> >  }
> >
> > -static void find_equal_scalars(struct bpf_verifier_state *vstate,
> > -                              struct bpf_reg_state *known_reg)
> > +static void __find_equal_scalars(struct linked_regs *reg_set, struct bpf_reg_state *reg,
> > +                                u32 id, u32 frameno, u32 spi_or_reg, bool is_reg)
>
> we should abandon "equal scalars" terminology, they don't have to be
> equal, they are just linked together (potentially with a fixed
> difference between them)
>
> how about "collect_linked_regs"?

Sounds good.

[...]

> > @@ -15312,6 +15500,21 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
> >                 return 0;
> >         }
> >
> > +       /* Push scalar registers sharing same ID to jump history,
> > +        * do this before creating 'other_branch', so that both
> > +        * 'this_branch' and 'other_branch' share this history
> > +        * if parent state is created.
> > +        */
> > +       if (BPF_SRC(insn->code) == BPF_X && src_reg->type == SCALAR_VALUE && src_reg->id)
> > +               find_equal_scalars(this_branch, src_reg->id, &linked_regs);
> > +       if (dst_reg->type == SCALAR_VALUE && dst_reg->id)
> > +               find_equal_scalars(this_branch, dst_reg->id, &linked_regs);
> > +       if (linked_regs.cnt > 1) {
>
> if we have just one, should it even be marked as linked?

Sorry, I don't understand. Do you suggest adding an additional check in
find_equal_scalars/collect_linked_regs and resetting the set if 'cnt'
equals 1?

[...]

> > +               err = push_jmp_history(env, this_branch, 0, linked_regs_pack(&linked_regs));
> > +               if (err)
> > +                       return err;
> > +       }
> > +
> >         other_branch = push_stack(env, *insn_idx + insn->off + 1, *insn_idx,
> >                                   false);
> >         if (!other_branch)
> > @@ -15336,13 +15539,13 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
> >                 if (BPF_SRC(insn->code) == BPF_X &&
> >                     src_reg->type == SCALAR_VALUE && src_reg->id &&
> >                     !WARN_ON_ONCE(src_reg->id != other_branch_regs[insn->src_reg].id)) {
> > -                       find_equal_scalars(this_branch, src_reg);
> > -                       find_equal_scalars(other_branch, &other_branch_regs[insn->src_reg]);
> > +                       copy_known_reg(this_branch, src_reg, &linked_regs);
> > +                       copy_known_reg(other_branch, &other_branch_regs[insn->src_reg], &linked_regs);
>
> I liked the "sync" terminology you used for bt, so why not call this
> "sync_linked_regs" ?

I kept the current name for the function.
The suggested name makes sense, though.

[...]
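
As a closing illustration of the collection step under the new name,
here is a user-space sketch (single frame, registers only; the real
function also walks spilled scalars on each frame's stack, and all
names and details below are assumptions, not the patch):

  #include <stdint.h>

  #define MAX_SCALAR_REGS 10   /* r0..r9; the frame pointer is never a scalar */
  #define LINKED_REGS_MAX 6

  struct scalar_reg  { uint32_t id; };  /* stand-in for bpf_reg_state */
  struct linked_reg  { uint8_t regno; };
  struct linked_regs { int cnt; struct linked_reg entries[LINKED_REGS_MAX]; };

  /* Record every scalar register sharing 'id' (the caller ensures
   * id != 0, since id 0 means "not linked"), i.e. every register whose
   * range changes together with the jump's src/dst register. A set of
   * size 1 is just the jump's own register -- nothing else is linked
   * to it, so there is nothing useful to push to jmp_history, hence
   * the 'linked_regs.cnt > 1' check at the call site in
   * check_cond_jmp_op().
   */
  static void collect_linked_regs(const struct scalar_reg *regs, uint32_t id,
                                  struct linked_regs *out)
  {
          int i;

          out->cnt = 0;
          for (i = 0; i < MAX_SCALAR_REGS && out->cnt < LINKED_REGS_MAX; i++) {
                  if (regs[i].id != id)
                          continue;
                  out->entries[out->cnt].regno = i;
                  out->cnt++;
          }
  }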