On Wed, Jan 10, 2024 at 09:14:51AM +0100, Artem Savkov wrote:
> On Tue, Jan 09, 2024 at 11:40:38AM -0800, Yonghong Song wrote:
> >
> > On 1/9/24 8:43 AM, Artem Savkov wrote:
> > > It is possible for bpf_kfunc_call_test_release() to be called from
> > > bpf_map_free_deferred() when bpf_testmod is already unloaded and
> > > perf_test_stuct.cnt which it tries to decrease is no longer in memory.
> > > This patch tries to fix the issue by waiting for all references to be
> > > dropped in bpf_testmod_exit().
> > >
> > > The issue can be triggered by running 'test_progs -t map_kptr' in 6.5,
> > > but is obscured in 6.6 by d119357d07435 ("rcu-tasks: Treat only
> > > synchronous grace periods urgently").
> > >
> > > Fixes: 65eb006d85a2a ("bpf: Move kernel test kfuncs to bpf_testmod")
> >
> > Please add your Signed-off-by tag.
>
> Thanks for noticing. Will resend with signed-off-by and your ack.
>
> > I think the root cause is that bpf_kfunc_call_test_acquire() kfunc
> > is defined in bpf_testmod and the kfunc returns some data in bpf_testmod.
> > But the release function bpf_kfunc_call_test_release() is in the kernel.
> > The release func tries to access some data in bpf_testmod which might
> > have been unloaded. The prog_test_ref_kfunc is defined in the kernel, so
> > no bpf_testmod btf reference is hold so bpf_testmod can be unloaded before
> > bpf_kfunc_call_test_release().
> > As you mentioned, we won't have this issue if bpf_kfunc_call_test_acquire()
> > is also in the kernel.
> >
> > I think putting bpf_kfunc_call_test_acquire() in bpf_testmod and
> > bpf_kfunc_call_test_release() in kernel is not a good idea and confusing.
> > But since this is only for tests, I guess we can live with that. With that,
>
> Correct. 65eb006d85a2a ("bpf: Move kernel test kfuncs to bpf_testmod")
> also mentions why bpf_kfunc_call_test_release() is not in the module and
> states that this is temporary. I'll add a comment in v2 so the wait can
> be removed once the functions are re-united.

I somehow recall it has to do with the fact you can't have trusted pointer
on module's object, so that's why those structs had to stay in kernel..
but I might be wrong

jirka

>
> > Acked-by: Yonghong Song <yonghong.song@xxxxxxxxx>
> >
> > > ---
> > >  tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c | 4 ++++
> > >  1 file changed, 4 insertions(+)
> > >
> > > diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > index 91907b321f913..63f0dbd016703 100644
> > > --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > @@ -2,6 +2,7 @@
> > >  /* Copyright (c) 2020 Facebook */
> > >  #include <linux/btf.h>
> > >  #include <linux/btf_ids.h>
> > > +#include <linux/delay.h>
> > >  #include <linux/error-injection.h>
> > >  #include <linux/init.h>
> > >  #include <linux/module.h>
> > > @@ -544,6 +545,9 @@ static int bpf_testmod_init(void)
> > >  static void bpf_testmod_exit(void)
> > >  {
> > > +	while (refcount_read(&prog_test_struct.cnt) > 1)
> > > +		msleep(20);
> > > +
> > >  	return sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
> > >  }
> >
>
> --
> Regards,
>   Artem
>
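
For anyone following along, here is a rough user-space sketch of the wait
loop the patch adds to bpf_testmod_exit(). It is not kernel code and not
the patch itself: cnt, ref_acquire(), ref_release(), fake_msleep(),
ticks_until_release and run_demo() are all hypothetical names made up for
illustration, a C11 atomic stands in for the kernel refcount, and the
deferred map free is simulated as a release that happens after a fixed
number of poll intervals.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for prog_test_struct.cnt; 1 is the initial reference. */
static atomic_int cnt = 1;

/* Hypothetical stand-ins for the acquire/release kfuncs. */
static void ref_acquire(void) { atomic_fetch_add(&cnt, 1); }
static void ref_release(void) { atomic_fetch_sub(&cnt, 1); }

/* Poll intervals left before the simulated deferred free drops its reference. */
static int ticks_until_release;

/* Stand-in for msleep(20): each call represents one elapsed poll interval. */
static void fake_msleep(void)
{
	if (ticks_until_release > 0 && --ticks_until_release == 0)
		ref_release(); /* simulated bpf_map_free_deferred() path */
}

/*
 * Same shape as the loop in the patch: keep polling until only the
 * initial reference is left. Returns how many times it had to sleep.
 */
static int wait_for_refs(void (*sleep_fn)(void))
{
	int polls = 0;

	while (atomic_load(&cnt) > 1) {
		sleep_fn();
		polls++;
	}
	return polls;
}

/* A map still holds one reference when "module exit" begins. */
static int run_demo(void)
{
	int polls;

	ref_acquire();
	ticks_until_release = 3;
	polls = wait_for_refs(fake_msleep);
	assert(atomic_load(&cnt) == 1); /* now safe to tear down */
	return polls;
}
```

With the release scheduled three intervals out, run_demo() returns 3. The
sketch only shows why exit has to wait while a reference is outstanding;
it says nothing about how long the real deferred free can take.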