Re: [PATCH bpf-next v3 3/4] bpf: verify scalar ids mapping in regsafe() using check_ids()

Eduard Zingerman <eddyz87@xxxxxxxxx> · Thu, 08 Jun 2023 23:58:38 +0300

On Thu, 2023-06-08 at 22:05 +0300, Eduard Zingerman wrote:
[...]
> > Hm.. It's clever and pretty minimal, I like it. We are basically
> > allocating virtual ID for SCALAR that doesn't have id, just to make
> > sure we get a conflict if the SCALAR with ID cannot be mapped into two
> > different SCALARs, right?
> > 
> > The only question would be if it's safe to do that for case when
> > old_reg->id != 0 and cur_reg->id == 0? E.g., if in old (verified)
> > state we have r6.id = r7.id = 123, and in new state we have r6.id = 0
> > and r7.id = 0, then your implementation above will say that states are
> > equivalent. But are they, given there is a link between r6 and r7 that
> > might be required for correctness. Which we don't have in current
> > state.
> 
> You mean the other way around, rold.id == 0, rcur.id != 0, right?
> (below 0/2 means: original value 0, replaced by new id 2).
> 
> (1)   old cur
> r6.id 0/2   1
> r7.id 0/3   1 check_ids returns true
> 
> (2)   old cur
> r6.id 1   0/2
> r7.id 1   0/3 check_ids returns false
> 
> Also, (1) is no different from (3):
> 
> (3)   old cur
> r6.id 1     3
> r7.id 2     3 check_ids returns true
> 
> Current check:
> 
> 		if (!rold->precise)
> 			return true;
> 		return range_within(rold, rcur) &&
> 		       tnum_in(rold->var_off, rcur->var_off) &&
> 		       check_ids(rold->id, rcur->id, idmap);
> 
> Will reject (1) and (2), but will accept (3).
> 
> New check:
> 
> 		if (!rold->precise)
> 			return true;
> 		return range_within(rold, rcur) &&
> 		       tnum_in(rold->var_off, rcur->var_off) &&
> 		       check_scalar_ids(rold->id, rcur->id, idmap);
> 
> Will reject (2), but will accept (1) and (3).
> 
> And modification of check_scalar_ids() to generate IDs only for rold
> or only for rcur would not reject (3) either.
> 
> (3) would be rejected only if current check is used together with
> elimination of unique scalar IDs from old states.
> 
> My previous experiments show that eliminating unique IDs from old
> states and not eliminating unique IDs from cur states is *very* bad
> for performance.
> 
> > 
> > So with this we are getting to my original concerns with your
> > !rold->id approach, which tries to ignore the necessity of link
> > between registers.
> > 
> > What if we do this only for old registers? Then, (in old state) r6.id
> > = 0, r7.id = 0, (in new state) r6.id = r7.id = 123. This will be
> > rejected because first we'll map 123 to newly allocated X for r6.id,
> > and then when we try to match r7.id=123 to another allocated ID X+1
> > we'll get a conflict, right?

[...]

Ok, here is what I think is the final version:
a. for each old or cur ID generate temporary unique ID;
b. for scalars use modified check_ids that forbids mapping same 'cur'
   ID to several different 'old' IDs.

(a) allows to reject situations like:

  (1)   old cur   (2)   old cur 
  r6.id 0   1     r6.id 1   0
  r7.id 0   1     r7.id 1   0

(b) allows to reject situations like:

  (3)   old cur
  r6.id 1   3
  r7.id 2   3

And whether some scalar ID is unique or not does not matter for the
algorithm.

Tests are passing, katran example is fine (350k insns, 29K states),
minor veristat regression:

File       Program                         States (DIFF)
---------  ------------------------------  -------------
bpf_xdp.o  tail_nodeport_nat_ingress_ipv4    +3 (+0.80%)
bpf_xdp.o  tail_rev_nodeport_lb4             +2 (+0.50%)

--- 8< -----------------------------

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 235d7eded565..5794dc7830db 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -15149,6 +15149,31 @@ static bool check_ids(u32 old_id, u32 cur_id, struct bpf_id_pair *idmap)
        return false;
 }
 
+static bool check_scalar_ids(struct bpf_verifier_env *env, u32 old_id, u32 cur_id,
+                            struct bpf_id_pair *idmap)
+{
+       unsigned int i;
+
+       old_id = old_id ? old_id : env->id_gen++;
+       cur_id = cur_id ? cur_id : env->id_gen++;
+
+       for (i = 0; i < BPF_ID_MAP_SIZE; i++) {
+               if (!idmap[i].old) {
+                       /* Reached an empty slot; haven't seen this id before */
+                       idmap[i].old = old_id;
+                       idmap[i].cur = cur_id;
+                       return true;
+               }
+               if (idmap[i].old == old_id)
+                       return idmap[i].cur == cur_id;
+               if (idmap[i].cur == cur_id)
+                       return false;
+       }
+       /* We ran out of idmap slots, which should be impossible */
+       WARN_ON_ONCE(1);
+       return false;
+}
+
 static void clean_func_state(struct bpf_verifier_env *env,
                             struct bpf_func_state *st)
 {
@@ -15325,7 +15350,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
                 */
                return range_within(rold, rcur) &&
                       tnum_in(rold->var_off, rcur->var_off) &&
-                      check_ids(rold->id, rcur->id, idmap);
+                      check_scalar_ids(env, rold->id, rcur->id, idmap);
        case PTR_TO_MAP_KEY:
        case PTR_TO_MAP_VALUE:
        case PTR_TO_MEM:

----------------------------- >8 ---