On Thu, 2023-04-20 at 19:54 -0400, Dave Marchevsky wrote: > Hi Eduard and Florian, thanks for finding this issue. I am looking into the > bpf_refcount side of things, and IIUC Eduard just submitted a series fixing > __retval / test infra. > > I'm having trouble reproducing the refcount-specific splats, would you mind > sharing .config? Here is what I use: https://gist.github.com/eddyz87/6c13e1783b5ae4b11b2d9e29fbe5ee49 > > On 4/20/23 7:22 PM, Eduard Zingerman wrote: > > [...] > > Looking at the error and at the test source code, it appears to me > > that there is an API misuse for the `refcount_t` type. > > > > The `bpf_refcount_acquire` invocation in the test expands as a call to > > bpf_refcount_acquire_impl(), which treats passed pointer as a value of > > `refcount_t`: > > > > /* Description > > * Increment the refcount on a refcounted local kptr, turning the > > * non-owning reference input into an owning reference in the process. > > * > > * The 'meta' parameter is rewritten by the verifier, no need for BPF > > * program to set it. > > * Returns > > * An owning reference to the object pointed to by 'kptr' > > */ > > extern void *bpf_refcount_acquire_impl(void *kptr, void *meta) __ksym; > > > > __bpf_kfunc void *bpf_refcount_acquire_impl(void *p__refcounted_kptr, void *meta__ign) > > { > > ... > > refcount_inc((refcount_t *)ref); > > return (void *)p__refcounted_kptr; > > } > > > > The comment for `refcount_inc` says: > > > > /** > > * ... > > * Will WARN if the refcount is 0, as this represents a possible use-after-free > > * condition. > > */ > > static inline void refcount_inc(refcount_t *r) > > > > And looking at block/blk-core.c as an example, refcount_t is supposed > > to be used as follows: > > - upon object creation it's refcount is set to 1: > > refcount_set(&q->refs, 1); > > - when reference is added the refcount is incremented: > > refcount_inc(&q->refs); > > - when reference is removed the refcount is decremented and checked: > > if (refcount_dec_and_test(&q->refs)) > > blk_free_queue(q); > > > > So, 0 is not actually a valid value for refcount_t instance. And I > > don't see any calls to refcount_set() in kernel/bpf/helpers.c, which > > implements bpf_refcount_acquire_impl(). > > > > Dave, I see that bpf_refcount_acquire_impl() was recently added by you, > > could you please take a look? > > > > refcount_set is called from bpf_obj_init_field in include/linux/bpf.h . > Maybe there is some scenario where the bpf_refcount doesn't pass through > bpf_obj_init_field... Regardless, digging now. Oh, sorry, missed that. I should have added some printks before posting... > > > The TLDR: of a thread: > > - __retval is currently ignored; > > - to fix it apply the patch below; > > - running the following command several times produces lot's > > of nasty errors in dmesg: > > > > $ for i in $(seq 1 4); do (./test_progs --allow=refcounted_kptr &); done > > > > diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c > > index 47e9e076bc8f..e2a1bdc5a570 100644 > > --- a/tools/testing/selftests/bpf/test_loader.c > > +++ b/tools/testing/selftests/bpf/test_loader.c > > @@ -587,7 +587,7 @@ void run_subtest(struct test_loader *tester, > > /* For some reason test_verifier executes programs > > * with all capabilities restored. Do the same here. > > */ > > - if (!restore_capabilities(&caps)) > > + if (restore_capabilities(&caps)) > > goto tobj_cleanup; > > > > do_prog_test_run(bpf_program__fd(tprog), &retval); > >