Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()

Eduard Zingerman <eddyz87@xxxxxxxxx> · Mon, 05 Jun 2023 23:43:53 +0300

On Sat, 2023-06-03 at 02:20 +0300, Eduard Zingerman wrote:
> On Fri, 2023-06-02 at 15:07 -0700, Andrii Nakryiko wrote:
> [...]
> > > And we don't really need to bother about unique IDs in non-cached state
> > > when rold->id check discussed in a sibling thread is used:
> > > 
> > >                 if (rold->precise && rold->id && !check_ids(rold->id, rcur->id, idmap))
> > >                         return false;
> > 
> > It makes me worry that we are mixing no ID and ID-ed SCALARs and
> > making them equivalent. I need to think some more about implications
> > (and re-read what you and Alexei discussed). I don't feel good about
> > this and suspect we'll miss some non-obvious corner case if we do
> > this.
> 
> Here is the complete argument in a single piece:
> 
>   Suppose:
>     - There is a checkpoint state 'Old' with two registers marked as:
>       - rA=precise scalar with range A, id=0;
>       - rB=precise scalar with range B, id=0.
>     - 'Old' is proven to be safe.
>     - There is a new state 'Cur' with two registers marked as:
>       - rA=scalar with range C, id=U;
>       - rB=scalar with range C, id=U;
>         (Note: if rA.id == rB.id the registers have identical range).
> 
>   Proposition:
>     As long as range C is a subset of both range A and range B
>     the state 'Cur' is safe.
> 
>   Proof:
>     State 'Cur' represents a special case of state 'Old',
>     a variant of 'Old' where rA and rB happen to have same value.
>     Thus 'Cur' is safe because 'Old' is safe.
> 

I had a separate discussion with Andrii on this topic.
Andrii is still concerned that the change from:

	if (rold->precise && !check_ids(idmap, rold, rcur))
		return false;
to:

	if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
		return false;

Is not fully equivalent and there might be a corner case where 'Cur'
and 'Old' don't behave in the same way. I don't have a better argument
than the one stated above.

On the other hand, this optimization is needed to mitigate relatively
small overhead:

$ ./veristat -e file,prog,states -f "states_pct>10" \
    -C master-baseline.log current.log
File         Program                         States  (DIFF)
-----------  ------------------------------  --------------
bpf_xdp.o    tail_handle_nat_fwd_ipv6        +155 (+23.92%)
bpf_xdp.o    tail_nodeport_nat_ingress_ipv4  +102 (+27.20%)
bpf_xdp.o    tail_rev_nodeport_lb4            +83 (+20.85%)
loop6.bpf.o  trace_virtqueue_add_sgs          +25 (+11.06%)

(there are also a few programs with overhead <1%, I don't list those
 for brevity).

On yet another hand:
- this change major impact is because of elimination of unique IDs in
  the 'Cur' state;
- once register spill/fill code is modified to generate scalar IDs in
  the same way as BPF_MOV code does it would be possible to drop
  unique scalar IDs in 'Cur' and in 'Old' in is_state_visited() /
  clean_live_states() using the same code.

Having this in mind we decided to withhold this change for now.
I will post an updated patch-set without it, Andrii will test it on
some more internal BPF programs to see if impact is still small.

> > > 
> > > Here, if rcur->id is unique there are two cases:
> > > - rold->id == 0: then rcur->id is just ignored
> > > - rold->id != 0: then rold->id/rcur->id pair would be added to idmap,
> > >                  there is some other precise old register with the
> > >                  same id as rold->id, so eventually check_ids()
> > >                  would make regsafe() return false.
> > > 
> > > > > - do a check as follows:
> > > > >   if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
> > > > >     return false;
> > > >