Re: [PATCH bpf-next v3 3/4] bpf: verify scalar ids mapping in regsafe() using check_ids()

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Thu, 8 Jun 2023 15:37:36 -0700

On Thu, Jun 8, 2023 at 1:58 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Thu, 2023-06-08 at 22:05 +0300, Eduard Zingerman wrote:
> [...]
> > > Hm.. It's clever and pretty minimal, I like it. We are basically
> > > allocating virtual ID for SCALAR that doesn't have id, just to make
> > > sure we get a conflict if the SCALAR with ID cannot be mapped into two
> > > different SCALARs, right?
> > >
> > > The only question would be if it's safe to do that for case when
> > > old_reg->id != 0 and cur_reg->id == 0? E.g., if in old (verified)
> > > state we have r6.id = r7.id = 123, and in new state we have r6.id = 0
> > > and r7.id = 0, then your implementation above will say that states are
> > > equivalent. But are they, given there is a link between r6 and r7 that
> > > might be required for correctness. Which we don't have in current
> > > state.
> >
> > You mean the other way around, rold.id == 0, rcur.id != 0, right?
> > (below 0/2 means: original value 0, replaced by new id 2).

no, I actually meant what I wrote, but I didn't realize that
check_ids() is kind of broken... Because it shouldn't allow the same
ID from cur state to be mapped to two different IDs in old state,
should it?

> >
> > (1)   old cur
> > r6.id 0/2   1
> > r7.id 0/3   1 check_ids returns true

I think this should be rejected.

> >
> > (2)   old cur
> > r6.id 1   0/2
> > r7.id 1   0/3 check_ids returns false

And this should be rejected.

> >
> > Also, (1) is no different from (3):
> >
> > (3)   old cur
> > r6.id 1     3
> > r7.id 2     3 check_ids returns true

And this definitely should be rejected.

The only situation that might not be rejected would be:

        old    cur
r6.id   0/1    3
r7.id.  0/2    4

And perhaps this one is ok as well?

        old    cur
r6.id   3      0/1
r7.id.  4      0/2

And my assumption was that that's what you are trying to do. But
weirdly check_ids() is enforcing only that old ID has to have a unique
mapping, which seems like a bug.

> >
> > Current check:
> >
> >               if (!rold->precise)
> >                       return true;
> >               return range_within(rold, rcur) &&
> >                      tnum_in(rold->var_off, rcur->var_off) &&
> >                      check_ids(rold->id, rcur->id, idmap);
> >
> > Will reject (1) and (2), but will accept (3).
> >
> > New check:
> >
> >               if (!rold->precise)
> >                       return true;
> >               return range_within(rold, rcur) &&
> >                      tnum_in(rold->var_off, rcur->var_off) &&
> >                      check_scalar_ids(rold->id, rcur->id, idmap);
> >
> > Will reject (2), but will accept (1) and (3).
> >
> > And modification of check_scalar_ids() to generate IDs only for rold
> > or only for rcur would not reject (3) either.
> >
> > (3) would be rejected only if current check is used together with
> > elimination of unique scalar IDs from old states.
> >
> > My previous experiments show that eliminating unique IDs from old
> > states and not eliminating unique IDs from cur states is *very* bad
> > for performance.
> >
> > >
> > > So with this we are getting to my original concerns with your
> > > !rold->id approach, which tries to ignore the necessity of link
> > > between registers.
> > >
> > > What if we do this only for old registers? Then, (in old state) r6.id
> > > = 0, r7.id = 0, (in new state) r6.id = r7.id = 123. This will be
> > > rejected because first we'll map 123 to newly allocated X for r6.id,
> > > and then when we try to match r7.id=123 to another allocated ID X+1
> > > we'll get a conflict, right?
>
> [...]
>
> Ok, here is what I think is the final version:
> a. for each old or cur ID generate temporary unique ID;
> b. for scalars use modified check_ids that forbids mapping same 'cur'
>    ID to several different 'old' IDs.
>
> (a) allows to reject situations like:
>
>   (1)   old cur   (2)   old cur
>   r6.id 0   1     r6.id 1   0
>   r7.id 0   1     r7.id 1   0
>
> (b) allows to reject situations like:
>
>   (3)   old cur
>   r6.id 1   3
>   r7.id 2   3
>
> And whether some scalar ID is unique or not does not matter for the
> algorithm.
>
> Tests are passing, katran example is fine (350k insns, 29K states),
> minor veristat regression:
>
> File       Program                         States (DIFF)
> ---------  ------------------------------  -------------
> bpf_xdp.o  tail_nodeport_nat_ingress_ipv4    +3 (+0.80%)
> bpf_xdp.o  tail_rev_nodeport_lb4             +2 (+0.50%)
>
> --- 8< -----------------------------
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 235d7eded565..5794dc7830db 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -15149,6 +15149,31 @@ static bool check_ids(u32 old_id, u32 cur_id, struct bpf_id_pair *idmap)
>         return false;
>  }
>
> +static bool check_scalar_ids(struct bpf_verifier_env *env, u32 old_id, u32 cur_id,
> +                            struct bpf_id_pair *idmap)
> +{
> +       unsigned int i;
> +
> +       old_id = old_id ? old_id : env->id_gen++;
> +       cur_id = cur_id ? cur_id : env->id_gen++;
> +
> +       for (i = 0; i < BPF_ID_MAP_SIZE; i++) {
> +               if (!idmap[i].old) {
> +                       /* Reached an empty slot; haven't seen this id before */
> +                       idmap[i].old = old_id;
> +                       idmap[i].cur = cur_id;
> +                       return true;
> +               }
> +               if (idmap[i].old == old_id)
> +                       return idmap[i].cur == cur_id;
> +               if (idmap[i].cur == cur_id)
> +                       return false;

I think this should just be added to existing check_ids(), I think
it's a bug that we don't check this condition today in check_ids().

But I'd say let's land fixes you have right now. And then work on
fixing and optimizing scala ID checks separately. We are doing too
many things at the same time :(

> +       }
> +       /* We ran out of idmap slots, which should be impossible */
> +       WARN_ON_ONCE(1);
> +       return false;
> +}
> +
>  static void clean_func_state(struct bpf_verifier_env *env,
>                              struct bpf_func_state *st)
>  {
> @@ -15325,7 +15350,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
>                  */
>                 return range_within(rold, rcur) &&
>                        tnum_in(rold->var_off, rcur->var_off) &&
> -                      check_ids(rold->id, rcur->id, idmap);
> +                      check_scalar_ids(env, rold->id, rcur->id, idmap);
>         case PTR_TO_MAP_KEY:
>         case PTR_TO_MAP_VALUE:
>         case PTR_TO_MEM:
>
> ----------------------------- >8 ---