Re: [PATCH bpf-next v2 1/4] bpf: verify scalar ids mapping in regsafe() using check_ids()

Eduard Zingerman <eddyz87@xxxxxxxxx> · Sat, 03 Jun 2023 02:20:19 +0300

On Fri, 2023-06-02 at 15:07 -0700, Andrii Nakryiko wrote:
[...]
> > > > I'm not sure I understand, could you please describe how it should
> > > > work for e.g.?:
> > > > 
> > > >     3: r6 &= 0xf            // assume safe bound
> > > >     4: if (r6 > r7) goto +1
> > > >     5: r7 = r6
> > > >     --- checkpoint #1 ---
> > > >     6: r7 = 0
> > > >     7: if (r7 > 10) goto exit;
> > > >     8: r7 = 0
> > > >     9: r9 += r6
> > > > 
> > > > __mark_chain_precision() would get to checkpoint #1 with only r6 as
> > > > precise, what should happen next?
> > > 
> > > it should mark all SCALARs that have r6's ID in env->bt, and then
> > > proceed with precision propagation until next parent state? This is
> > > where you'll mark r7, because in parent state (checkpoint #1) r6.id ==
> > > r7.id.
> > 
> > That's what I do now.
> 
> Ok, cool...
> 
> > Sorry, I thought you had a suggestion on how to avoid the precise set
> > overestimation (e.g. how to detect that "6: r7 = 0" breaks the link).
> 
> ..but "overestimation" confuses me. There is no overestimation. After
> checkpoint #1 (let's say we have checkpoint #2 after instruction 9),
> r6 and r7 are not linked, and if we had to mark r6 as precise, we'd
> mark only r7. But as of checkpoint #1, r7.id == r6.id and they are
> linked. So there is no overestimation. They are linked together as of
> checkpoint #1.

Sorry, I should have adjusted the example a bit more, consider:

                                   precise registers
     1: r6 = scalar of range X
     2: r7 = unbound scalar
     3: r8 = unbound scalar
     4: r9 = stack of size X
     5: if (r6 > r7) goto +1
     6: r7 = r6
    --- checkpoint #1 ---          r6, r7 (because r6.id == r7.id in checkpoint #1)
     7: r7 = 0                     r6
     8: if (r7 > r8) goto exit     r6     (unpredictable jump => r{7,8} not precise)
     9: r7 = 0                     r6
    10: r9 += r6                   r6
    --- checkpoint #2 ---

Here there is no real need to mark r7 as precise, but current
algorithm would. Thus, we overestimate the set of precise registers.
(But it might be fine if workaround is complicated).

> But anyways, I think we are on the same page, even if we don't use the
> same words :)
> 
> > 
> > > It might be easier to just discuss latest code you have, there are
> > > lots of intricacies, and code wins over words :)
> > 
> > Here is what I have now:
> > https://github.com/kernel-patches/bpf/compare/bpf-next_base...eddyz87:bpf:verify-ids-for-scalars-in-regsafe-v3
> 
> feels a bit heavyweight to do sorting and stuff. I think in practice
> you'll have only very small amount of linked SCALAR IDs, so a simple
> linear search would be faster than all that sort+bsearch.
> 
> Look at check_ids(). It's called all the time and is a linear search.
> I think it's fine to keep thing simple here as well.

:(
Ok, I can keep it linear.
In ideal world `sort` / `bsearch` would be inlined and have a special
case for count < 16 (or whatever is the magic number on modern CPUs).

> > The interesting part is mark_precise_scalar_ids().
> > 
> > But a few tests are not passing because expected messages have to be adjusted.
> > And a lot of tests have to be added.
> > We can delay discussion until I submit v3 (worst case tomorrow).
> > 
> > > > As a side note: I added several optimizations:
> > > > - avoid allocation of scalar ids for constants;
> > > > - remove sole scalar ids from cached states;
> > > 
> > > so that's what I was proposing earlier,
> > 
> > Yes, it turned out beneficial when I inspected logs for bpf_xdp.o.
> > 
> > > but not just from cached
> > > states, but from any state. As soon as we get SCALAR with some ID that
> > > is not shared by any other SCALAR, we should try to drop that ID. The
> > > question is only in how to implement this efficiently.
> > 
> > No, we don't want to do it for non-cached state, not until we generate
> > scalar ids on stack spills and fills. Otherwise we would break
> > find_equal_scalars() for the following case:
> > 
> >   r1 = r2         // r1 gains ID
> >   fp[-8] = r1     //
> >   r2 = 0          //
> >   r1 = 0          // fp[-8] has a unique ID now
> 
> exactly, as of right now there is no linked registers anymore, so we
> can clear ID
> 
> 
> >   --- checkpoint ---
> >   r1 = fp[-8]
> 
> and this is where we should generate a new ID and assign it to r1.id
> and fp[-8].id.
> 
> How is this fundamentally different from just `r1 = r2`?

I agree that IDs should be issued on spills and fills.
However, this is not happening right now.
I have an experimental patch-set with id generation at spill,
and see some verification performance issues.
If possible, I'd prefer to complete these two tasks separately.

> >   r2 = fp[-8]
> 
> and now r2.id = r1.id = fp[-8].id (but it's different ID than as
> before checkpoint)
> 
> >   if r1 > 10 goto exit; // range propagates to r2 now,
> >                         // but won't propagate if fp[-8] ID
> >                         // is cleared at checkpoint
> > 
> > (A bit contrived, but conveys the idea)
> > 
> > And we don't really need to bother about unique IDs in non-cached state
> > when rold->id check discussed in a sibling thread is used:
> > 
> >                 if (rold->precise && rold->id && !check_ids(rold->id, rcur->id, idmap))
> >                         return false;
> 
> It makes me worry that we are mixing no ID and ID-ed SCALARs and
> making them equivalent. I need to think some more about implications
> (and re-read what you and Alexei discussed). I don't feel good about
> this and suspect we'll miss some non-obvious corner case if we do
> this.

Here is the complete argument in a single piece:

  Suppose:
    - There is a checkpoint state 'Old' with two registers marked as:
      - rA=precise scalar with range A, id=0;
      - rB=precise scalar with range B, id=0.
    - 'Old' is proven to be safe.
    - There is a new state 'Cur' with two registers marked as:
      - rA=scalar with range C, id=U;
      - rB=scalar with range C, id=U;
        (Note: if rA.id == rB.id the registers have identical range).

  Proposition:
    As long as range C is a subset of both range A and range B
    the state 'Cur' is safe.

  Proof:
    State 'Cur' represents a special case of state 'Old',
    a variant of 'Old' where rA and rB happen to have same value.
    Thus 'Cur' is safe because 'Old' is safe.

> > 
> > Here, if rcur->id is unique there are two cases:
> > - rold->id == 0: then rcur->id is just ignored
> > - rold->id != 0: then rold->id/rcur->id pair would be added to idmap,
> >                  there is some other precise old register with the
> >                  same id as rold->id, so eventually check_ids()
> >                  would make regsafe() return false.
> > 
> > > > - do a check as follows:
> > > >   if (rold->precise && rold->id && !check_ids(idmap, rold, rcur))
> > > >     return false;
> > > 
> > > Hm.. do we need extra special case here? With precision changes we are
> > > discussion, and this removing singular SCALAR IDs you are proposing,
> > > just extending existing logic to:
> > > 
> > >                 if (regs_exact(rold, rcur, idmap))
> > >                         return true;
> > >                 if (env->explore_alu_limits)
> > >                         return false;
> > >                 if (!rold->precise)
> > >                         return true;
> > >                 /* new val must satisfy old val knowledge */
> > >                 return range_within(rold, rcur) &&
> > >                        check_ids(rold->id, rcur->id, idmap) &&
> > >                        check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap) &&
> > >                        tnum_in(rold->var_off, rcur->var_off);
> > > 
> > > wouldn't be enough?
> > 
> > Yes, it could be shortened as below:
> > 
> >                  return range_within(rold, rcur) &&
> >                         (rold->id == 0 || check_ids(rold->id, rcur->id, idmap)) &&
> >                         check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap) &&
> >                         tnum_in(rold->var_off, rcur->var_off);
> > 
> > but I wanted a separate place to put a long comment at.
> 
> you can still put a big comment right here? At the very least I'd move
> this new condition to after `if (!rold->precise)`. Where you put it
> right now seems a bit out of place.

I'll move the condition after `if (!rold->precise)`.

> > > > And I'm seeing almost zero performance overhead now.
> > > > So, maybe what we figured so far is good enough.
> > > > Need to add more tests, though.
> >