On 2023-09-28 11:14, Jinjie Ruan wrote:
> As Marco pointed out, commit 2810c1e99867 ("kunit: Fix wild-memory-access
> bug in kunit_free_suite_set()") causes test suites to run while the test
> module is still in MODULE_STATE_COMING. In that state, the module
> is not fully initialized, lacking sysfs, module_memory, args, init
> function which causes null-ptr-deref of using fake devices below.
>
> Since load_module() notify MODULE_STATE_COMING in prepare_coming_module(),
> and then init sysfs and args etc. in parse_args() and mod_sysfs_setup(),
> after that it notify MODULE_STATE_LIVE in do_init_module(), and fake driver
> in the test suits depend on them. So the test suits should be executed when
> notify MODULE_STATE_LIVE.
>
> But the kunit_free_suite_set() in kunit_module_exit() depends on the
> success of kunit_filter_suites() in kunit_module_init(). The best practice
> is to alloc and init resource when notify MODULE_STATE_COMING and free them
> when notify MODULE_STATE_GOING. So split the kunit_module_exec() from
> kunit_module_init() to run test suits when MODULE_STATE_LIVE, call
> kunit_filter_suites() and allocate memory in kunit_module_init() and call
> kunit_free_suite_set() in kunit_module_exit() to free the memory.
>
> So if load_module() succeeds and notify module state as below, it calls
> kunit_module_init(), kunit_module_exec() and kunit_module_exit(), which
> will work ok. The mod->state state machine when load_module() succeeds:
>
>                           kunit_filter_suites()     kunit_module_exec()
> MODULE_STATE_UNFORMED ---> MODULE_STATE_COMING ---> MODULE_STATE_LIVE
>         ^                                              |
>         |                                              |
>         +---------------- MODULE_STATE_GOING <---------+
>                         kunit_free_suite_set()
>
> If load_module() fails and notify module state as below, it call
> kunit_module_init() and kunit_module_exit(), which will also work ok.
> The mod->state state machine when load_module() fails at mod_sysfs_setup():
>
>                           kunit_filter_suites()     kunit_free_suite_set()
> MODULE_STATE_UNFORMED ---> MODULE_STATE_COMING ---> MODULE_STATE_GOING
>         ^                                               |
>         |                                               |
>         +-----------------------------------------------+
>
> general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
> CPU: 1 PID: 1868 Comm: modprobe Tainted: G W N 6.6.0-rc3+ #61
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> RIP: 0010:kobject_namespace+0x71/0x150
> Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 cd 00 00 00 48 b8 00 00 00 00 00 fc ff df 49 8b 5c 24 28 48 8d 7b 18 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 c1 00 00 00 48 8b 43 18 48 85 c0 74 79 4c 89 e7
> RSP: 0018:ffff88810f797288 EFLAGS: 00010206
> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0000000000000003 RSI: ffffffff847b4900 RDI: 0000000000000018
> RBP: ffff88810ba08940 R08: 0000000000000001 R09: ffffed1021ef2e0f
> R10: ffff88810f79707f R11: 746e756f63666572 R12: ffffffffa0241990
> R13: ffff88810ba08958 R14: ffff88810ba08968 R15: ffffffff84ac6c20
> FS: 00007ff9f2186540(0000) GS:ffff888119c80000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fff73a2cff8 CR3: 000000010b77b002 CR4: 0000000000770ee0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
>  <TASK>
>  ? die_addr+0x3d/0xa0
>  ? exc_general_protection+0x144/0x220
>  ? asm_exc_general_protection+0x22/0x30
>  ? kobject_namespace+0x71/0x150
>  kobject_add_internal+0x267/0x870
>  kobject_add+0x120/0x1f0
>  ? kset_create_and_add+0x160/0x160
>  ? __kmem_cache_alloc_node+0x1d2/0x350
>  ? _raw_spin_lock+0x87/0xe0
>  ? kobject_create_and_add+0x3c/0xb0
>  kobject_create_and_add+0x68/0xb0
>  module_add_driver+0x260/0x350
>  bus_add_driver+0x2c9/0x580
>  driver_register+0x133/0x460
>  kunit_run_tests+0xdb/0xef0
>  ? _prb_read_valid+0x3e3/0x550
>  ? _raw_spin_lock+0x87/0xe0
>  ? _raw_spin_lock_bh+0xe0/0xe0
>  ? __send_ipi_mask+0x1ba/0x450
>  ? __pte_offset_map+0x19/0x1f0
>  ? __pte_offset_map_lock+0xd6/0x1b0
>  ? __kunit_test_suites_exit+0x30/0x30
>  ? kvm_smp_send_call_func_ipi+0x68/0xc0
>  ? do_sync_core+0x22/0x30
>  ? smp_call_function_many_cond+0x1be/0xcf0
>  ? __text_poke+0x890/0x890
>  ? __text_poke+0x890/0x890
>  ? on_each_cpu_cond_mask+0x46/0x70
>  ? text_poke_bp_batch+0x413/0x570
>  ? do_sync_core+0x30/0x30
>  ? __jump_label_patch+0x34c/0x350
>  ? mutex_unlock+0x80/0xd0
>  ? __mutex_unlock_slowpath.constprop.0+0x2a0/0x2a0
>  __kunit_test_suites_init+0xc4/0x120
>  kunit_module_notify+0x36c/0x3b0
>  ? __kunit_test_suites_init+0x120/0x120
>  ? preempt_count_add+0x79/0x150
>  notifier_call_chain+0xbf/0x280
>  ? kasan_quarantine_put+0x21/0x1a0
>  blocking_notifier_call_chain_robust+0xbb/0x140
>  ? notifier_call_chain+0x280/0x280
>  ? 0xffffffffa0238000
>  load_module+0x4af0/0x67d0
>  ? module_frob_arch_sections+0x20/0x20
>  ? rwsem_down_write_slowpath+0x11a0/0x11a0
>  ? kernel_read_file+0x3ca/0x510
>  ? __x64_sys_fspick+0x2a0/0x2a0
>  ? init_module_from_file+0xd2/0x130
>  init_module_from_file+0xd2/0x130
>  ? __ia32_sys_init_module+0xa0/0xa0
>  ? userfaultfd_unmap_prep+0x3d0/0x3d0
>  ? _raw_spin_lock_bh+0xe0/0xe0
>  idempotent_init_module+0x339/0x610
>  ? init_module_from_file+0x130/0x130
>  ? __fget_light+0x57/0x500
>  __x64_sys_finit_module+0xba/0x130
>  do_syscall_64+0x35/0x80
>  entry_SYSCALL_64_after_hwframe+0x46/0xb0
>
> Fixes: 2810c1e99867 ("kunit: Fix wild-memory-access bug in kunit_free_suite_set()")
> Reported-by: Marco Pagani <marpagan@xxxxxxxxxx>
> Signed-off-by: Jinjie Ruan <ruanjinjie@xxxxxxxxxx>
> ---
>  lib/kunit/test.c | 25 ++++++++++++++++---------
>  1 file changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/lib/kunit/test.c b/lib/kunit/test.c
> index 145f70219f46..8fac4783c676 100644
> --- a/lib/kunit/test.c
> +++ b/lib/kunit/test.c
> @@ -739,7 +739,6 @@ static int kunit_module_init(struct module *mod)
>  	struct kunit_suite_set suite_set = {
>  		mod->kunit_suites, mod->kunit_suites + mod->num_kunit_suites,
>  	};
> -	const char *action = kunit_action();
>  	int err = 0;
>  
>  	suite_set = kunit_filter_suites(&suite_set,
> @@ -752,16 +751,28 @@ static int kunit_module_init(struct module *mod)
>  	mod->kunit_suites = (struct kunit_suite **)suite_set.start;
>  	mod->num_kunit_suites = suite_set.end - suite_set.start;
>  
> -	if (!action)
> +	return err;
> +}
> +
> +static void kunit_module_exec(struct module *mod)
> +{
> +	struct kunit_suite_set suite_set = {
> +		mod->kunit_suites, mod->kunit_suites + mod->num_kunit_suites,
> +	};
> +	const char *action = kunit_action();
> +
> +	if (!action) {
>  		kunit_exec_run_tests(&suite_set, false);
> +
> +		__kunit_test_suites_exit(mod->kunit_suites,
> +					 mod->num_kunit_suites);
> +	}

I don't think destroying debugfs right after running the tests is
advisable. The reason why I sent an RFC is to leave room for a
discussion on which is the best way to solve the issue. I think it
would be better to have a discussion before rushing patches.

Thanks,
Marco

>  	else if (!strcmp(action, "list"))
>  		kunit_exec_list_tests(&suite_set, false);
>  	else if (!strcmp(action, "list_attr"))
>  		kunit_exec_list_tests(&suite_set, true);
>  	else
>  		pr_err("kunit: unknown action '%s'\n", action);
> -
> -	return err;
>  }
>  
>  static void kunit_module_exit(struct module *mod)
> @@ -769,11 +780,6 @@ static void kunit_module_exit(struct module *mod)
>  	struct kunit_suite_set suite_set = {
>  		mod->kunit_suites, mod->kunit_suites + mod->num_kunit_suites,
>  	};
> -	const char *action = kunit_action();
> -
> -	if (!action)
> -		__kunit_test_suites_exit(mod->kunit_suites,
> -					 mod->num_kunit_suites);
>  
>  	kunit_free_suite_set(suite_set);
>  }
> @@ -789,6 +795,7 @@ static int kunit_module_notify(struct notifier_block *nb, unsigned long val,
>  		ret = kunit_module_init(mod);
>  		break;
>  	case MODULE_STATE_LIVE:
> +		kunit_module_exec(mod);
>  		break;
>  	case MODULE_STATE_GOING:
>  		kunit_module_exit(mod);
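
For reference while discussing, the notification ordering described in
the quoted commit message can be modelled in plain userspace C. This is
only a rough sketch, not kernel code: sim_state, sim_module,
sim_notify() and sim_load() are hypothetical names made up for the
illustration, and the flags merely stand in for the effects of
kunit_filter_suites(), mod_sysfs_setup() and running the suites. It
does not model debugfs at all, so it says nothing about when the
debugfs entries should be torn down.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the module states named in the patch. */
enum sim_state { SIM_COMING, SIM_LIVE, SIM_GOING };

/* Hypothetical stand-in for struct module; not a kernel type. */
struct sim_module {
	bool suites_allocated;	/* models a successful kunit_filter_suites() */
	bool sysfs_ready;	/* models mod_sysfs_setup() having completed */
	bool tests_ran;		/* models the suites having been executed */
};

/* Models the notifier callback with init, exec and exit split by state. */
static void sim_notify(struct sim_module *m, enum sim_state state)
{
	switch (state) {
	case SIM_COMING:
		/* Allocate and filter only; nothing is run yet. */
		m->suites_allocated = true;
		break;
	case SIM_LIVE:
		/* Suites run only once the module is fully set up. */
		assert(m->suites_allocated && m->sysfs_ready);
		m->tests_ran = true;
		break;
	case SIM_GOING:
		/* Free what SIM_COMING allocated, on both paths. */
		m->suites_allocated = false;
		break;
	}
}

/*
 * Models the notification order described for load_module().
 * fail_sysfs simulates a failure at the mod_sysfs_setup() step.
 */
static void sim_load(struct sim_module *m, bool fail_sysfs)
{
	sim_notify(m, SIM_COMING);
	if (fail_sysfs) {
		sim_notify(m, SIM_GOING);
		return;
	}
	m->sysfs_ready = true;
	sim_notify(m, SIM_LIVE);
}
```

Calling sim_load() with fail_sysfs set goes COMING -> GOING without
ever running the suites; without it, the suites run at LIVE and a later
sim_notify(m, SIM_GOING) releases the allocation.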