Re: [PATCH bpf-next v2 1/3] bpf: track find_equal_scalars history on per-instruction level

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Wed, 10 Jul 2024 08:21:56 -0700

On Tue, Jul 9, 2024 at 11:36 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Tue, 2024-07-09 at 22:28 -0700, Andrii Nakryiko wrote:
>
> [...]
>
> > > > >   r2 = r10             |
> > > > >   r2 += r0             v mark_chain_precision(r0)
> > > > >
> > > > >             while doing mark_chain_precision(r0)
> > > > >   r1 = r0              ^
> > > > >   if r1 < 8  goto ...  | mark r0,r1 as precise
> > > > >   if r0 > 16 goto ...  | mark r0,r1 as precise
> > > > >   r2 = r10             |
> > > > >   r2 += r0             | mark r0 precise
> > > >
> > > > let's reverse the order here so it's linear in how the algorithm
> > > > actually works (backwards)?
> > >
> > > I thought the arrow would be enough. Ok, can reverse.
> >
> > it's the reverse order compared to what you'd see in the verifier log.
> > I did see the arrow (though it wasn't all that clear on the first
> > reading), but still feels like it would be better to have consistent
> > order with verifier log
>
> Ok, no problem
>
> >
> > [...]
> >
> > > > > @@ -3844,6 +3974,7 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
> > > > >                          */
> > > > >                         bt_set_reg(bt, dreg);
> > > > >                         bt_set_reg(bt, sreg);
> > > > > +               } else if (BPF_SRC(insn->code) == BPF_K) {
> > > > >                          /* else dreg <cond> K
> > > >
> > > > drop "else" from the comment then? I like this change.
> > >
> > > This is actually a leftover from v1. I can drop "else" from the
> > > comment or drop this hunk as it is not necessary for the series.
> >
> > I'd keep explicit `else if`
>
> Ok, will do
>
> [...]
>
> > > > > @@ -15312,6 +15500,21 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
> > > > >                 return 0;
> > > > >         }
> > > > >
> > > > > +       /* Push scalar registers sharing same ID to jump history,
> > > > > +        * do this before creating 'other_branch', so that both
> > > > > +        * 'this_branch' and 'other_branch' share this history
> > > > > +        * if parent state is created.
> > > > > +        */
> > > > > +       if (BPF_SRC(insn->code) == BPF_X && src_reg->type == SCALAR_VALUE && src_reg->id)
> > > > > +               find_equal_scalars(this_branch, src_reg->id, &linked_regs);
> > > > > +       if (dst_reg->type == SCALAR_VALUE && dst_reg->id)
> > > > > +               find_equal_scalars(this_branch, dst_reg->id, &linked_regs);
> > > > > +       if (linked_regs.cnt > 1) {
> > > >
> > > > if we have just one, should it be even marked as linked?
> > >
> > > Sorry, I don't understand. Do you suggest to add an additional check
> > > in find_equal_scalars/collect_linked_regs and reset it if 'cnt' equals 1?
> >
> > I find `if (linked_regs.cnt > 1)` check a bit weird and it feels like
> > it should be unnecessary. As soon as we are left with just one
> > "linked" register (linked with what? with itself?) it shouldn't be
> > linked anymore. Is there a point where we break the link between
> > registers where we can/should drop ID from the singularly linked
> > register? Why keep that scalar register ID set?
>
> I can push this check inside find_equal_scalars/collect_linked_regs, e.g.:
>
> collect_linked_regs(... linked_regs ...)
> {
>         ...
>         if (linked_regs.cnt == 1)
>                 linked_regs.cnt = 0;

I mean, fine, that's ok. But you are missing the point I'm making. I'm
saying there is somewhere in the verifier (and I'm too lazy/don't care
to go find where) where we break linked registers link (we reset ID on
one of them, probably). What I am asking is whether we should have a
check there to also reset ID on the last remaining
"kind-of-linked-but-not-really-anymore" register.

Anyways, this doesn't have to be solved right away, so let's do this
fixup you are proposing here and keep clean "linked_regs.cnt > 0"
check below.

>         ...
> }
>
> But then this particular place would have to be modified as follows:
>
>         if (linked_regs.cnt > 0) {

yes, this makes total sense ("are there any linked regs? if not, there
is nothing to push to history")

>                 err = push_jmp_history(env, this_branch, 0, linked_regs_pack(&linked_regs));
>                 if (err)
>                         return err;
>         }
>
> Or something similar has to be done inside push_jmp_history().

no need to push this inside push_jmp_history(), why paying the price
of linked_regs_pack() unnecessarily?

>
> [...]