Re: [PATCH v2] selftests/bpf: Update map_kptr examples to reflect real use-cases

David Vernet <void@xxxxxxxxxxxxx> · Tue, 4 Oct 2022 10:29:47 -0500

On Mon, Oct 03, 2022 at 06:22:55PM +0200, Kumar Kartikeya Dwivedi wrote:

[...]

> > > >  noinline void bpf_kfunc_call_test_release(struct prog_test_ref_kfunc *p)
> > > >  {
> > > >         if (!p)
> > > >                 return;
> > > >
> > > > -       refcount_dec(&p->cnt);
> > > > +       WARN_ON_ONCE(atomic_read(&p->destroyed));
> > > > +       if (refcount_dec_and_test(&p->cnt))
> > > > +               call_rcu(&p->rcu, delayed_destroy_test_ref_struct);
> > >
> > > I wonder whether this is ever called, I haven't really given this
> > > patch a shot, but I don't see how the refcount can ever drop back to
> > > zero. It's initialized as 1, and then only acquired after that, so
> > > pairing all releases should still preserve refcount as 1.
> >
> > Yeah, the call_rcu() path is never hit. If we wanted to update the test
> > so that this codepath was actually exercised, I think we'd need to add
> > another kfunc that returned a reference to a dynamically allocated
> > object rather than using the global, static one. I'm happy to do that if
> > we think it's useful. The downside to adding more of these test kfuncs
> > is that they actually do add a small bit of runtime overhead to the
> > kernel because they're unconditionally registered in the __init function
> > for test_run.c.
> >
> 
> But that only happens once, right? It still happens, so I don't see
> what changes.

The idea here would be to return a dynamically allocated object with an
initial refcount of 1 that's owned by the _BPF program_, rather than
what we have today where the global struct has an initial refcount
that's owned by the main kernel and is never expected to go to 0. For
all success (i.e. non-fail) testcases that are able to dynamically
allocate this object, the refcount should go to 0 for each of them and
the object will be destroyed after the current RCU grace period. Please
let me know if I've misunderstood your point.

> > > Also, even if you made it work, wouldn't you have the warning once you
> > > run more selftests using prog_test_run, if you just set the  destroyed
> > > bit on each test run?
> >
> > If we want to update the test to have the refcount drop to 0, we would
> > probably have to instead use dynamically allocated objects. At that
> > point, we'd probably just crash instead of seeing a warning if we
> > accidentally let a caller invoke acquire or release after the object had
> > been destroyed. Maybe the better thing to do here is to just warn
> > unconditionally in the destructor rather than setting a flag? What we
> > really want to ensure is that the final refcount that's "owned" by the
> > main kernel is never dropped.
> 
> I think the refcount_t API already warns if underflow happens.

Right, a warning would probably show up if we did a release that caused
an underflow, but it would not for an acquire after the refcount dropped
to 0.

> To be clear, I don't mind if you want to improve this, it's certainly
> a mess right now. Tests can't even run in parallel easily because it's
> global. Testing like an actually allocated object might be a good way
> to simulate a real scenario. And I totally agree that having a real
> example is useful to people who want to add support for more kptrs.

Ok, let me update the tests to do two things then:

1. Add a new test kfunc called bpf_kfunc_call_test_alloc() which returns
   a dynamically allocated instance of a prog_test_ref_kfunc *. This is
   similar in intention to bpf_xdp_ct_alloc().
2. Update bpf_kfunc_call_test_acquire() and
   bpf_kfunc_call_test_release() to take a trusted pointer to that
   allocated prog_test_ref_kfunc *.
3. Keep the other changes which e.g. use RCU in
   bpf_kfunc_call_test_kptr_get() to synchronize on getting the kptr.
   Once the __rcu kptr variant is landed we can get rid of
   bpf_kfunc_call_test_kptr_get() and make bpf_kfunc_call_test_acquire()
   require an __rcu pointer.

Continuing on point (3) above, we should _probably_ also have an example
for an object that isn't RCU-protected? I imagine that we don't want to
get rid of KF_KPTR_GET entirely once we have __rcu pointers because some
kptr_get() implementations may synchronize in other ways, such as with
spinlocks or whatever. Let's leave this until after __rcu lands though.

Does this all sound good?

> Maybe some comments around the code would also be helpful on what the
> expectations are. We should make it easy for people to hook into this
> stuff. Nobody reads documentation, people usually only look at
> existing examples.

For sure, can do.