Hi, On Thu, 07 Mar 2024, Nico Pache wrote: > Hi, > > One of the RFC 6803 key derivation kunit subtests is failing. > > cki-project data warehouse : https://datawarehouse.cki-project.org/issue/2514 > > Arches: X86_64, ARM64, S390x, ppc64le > First Appeared: ~6.8.rc2 > > TRACE: > # Subtest: RFC 6803 key derivation > # RFC 6803 key derivation: ASSERTION FAILED at net/sunrpc/auth_gss/gss_krb5_test.c:63 > Expected err == 0, but > err == -110 (0xffffffffffffff92) > not ok 1 Derive Kc subkey for camellia128-cts-cmac > ok 2 Derive Ke subkey for camellia128-cts-cmac > ok 3 Derive Ki subkey for camellia128-cts-cmac > ok 4 Derive Kc subkey for camellia256-cts-cmac > ok 5 Derive Ke subkey for camellia256-cts-cmac > ok 6 Derive Ki subkey for camellia256-cts-cmac > # RFC 6803 key derivation: pass:5 fail:1 skip:0 total:6 > not ok 1 RFC 6803 key derivation This was broken by: c72a870926c2 kunit: add ability to run tests after boot using debugfs __kunit_test_suites_init() runs any time a kernel module is loaded, via the "kunit_mod_nb" notifier_block... even if the kernel module has no kunit tests. But now __kunit_test_suites_init() also locks a mutex, which is a problem if a kunit test itself needs to load a kernel module (which the gss_krb5_test module does). This fixes it for me: ---8<--- diff --git a/lib/kunit/test.c b/lib/kunit/test.c index 088489856db8..18af9453632b 100644 --- a/lib/kunit/test.c +++ b/lib/kunit/test.c @@ -707,6 +707,9 @@ int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_ { unsigned int i; + if (num_suites == 0) + return 0; + if (!kunit_enabled() && num_suites > 0) { pr_info("kunit: disabled\n"); return 0; ---8<--- More detail below: Here's the modprobe command where I loaded the gss_krb5_test module. This process has the "kunit_run_lock" mutex locked: PID: 1468 TASK: ffff9aed0ac20000 CPU: 0 COMMAND: "modprobe" #0 [ffffba974196f6f8] __schedule at ffffffff83fd85f5 #1 [ffffba974196f7b0] schedule at ffffffff83fd9672 #2 [ffffba974196f7c8] schedule_timeout at ffffffff83fe0308 #3 [ffffba974196f818] wait_for_completion_timeout at ffffffff83fda3d4 #4 [ffffba974196f878] kunit_try_catch_run at ffffffffc0d5e851 [kunit] #5 [ffffba974196f8c8] kunit_run_tests at ffffffffc0d5c0ea [kunit] #6 [ffffba974196fb78] __kunit_test_suites_init at ffffffffc0d5c9af [kunit] #7 [ffffba974196fb98] kunit_module_notify at ffffffffc0d5ba4b [kunit] #8 [ffffba974196fc08] notifier_call_chain at ffffffff8314647a #9 [ffffba974196fc40] blocking_notifier_call_chain_robust at ffffffff83146565 #10 [ffffba974196fc88] load_module at ffffffff831e1935 #11 [ffffba974196fde8] __do_sys_init_module at ffffffff831e1fba #12 [ffffba974196fec0] do_syscall_64 at ffffffff83fc3461 #13 [ffffba974196fee8] do_user_addr_fault at ffffffff830979df #14 [ffffba974196ff28] exc_page_fault at ffffffff83fc9c7f #15 [ffffba974196ff50] entry_SYSCALL_64_after_hwframe at ffffffff840000ea RIP: 00007ff1f272b4ae RSP: 00007ffd45db8f68 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 000055bf4c0c4b20 RCX: 00007ff1f272b4ae RDX: 000055bf4b204e79 RSI: 0000000000099691 RDI: 000055bf4cbfd130 RBP: 00007ffd45db9020 R8: 000055bf4c0c4010 R9: 0000000000000007 R10: 0000000000000001 R11: 0000000000000246 R12: 000055bf4b204e79 R13: 0000000000040000 R14: 000055bf4c0c4c50 R15: 000055bf4c0c4390 ORIG_RAX: 00000000000000af CS: 0033 SS: 002b Here's the kunit test case running. It's trying to allocate "cmac(camellia)" via crypto_alloc_shash(): PID: 1508 TASK: ffff9aed155d0000 CPU: 1 COMMAND: "kunit_try_catch" #0 [ffffba974194fba0] __schedule at ffffffff83fd85f5 #1 [ffffba974194fc58] schedule at ffffffff83fd9672 #2 [ffffba974194fc70] schedule_timeout at ffffffff83fe0308 #3 [ffffba974194fcc0] wait_for_completion_killable_timeout at ffffffff83fda708 #4 [ffffba974194fd20] crypto_larval_wait at ffffffff83747fb4 #5 [ffffba974194fd38] crypto_alg_mod_lookup at ffffffff83748252 #6 [ffffba974194fd70] crypto_alloc_tfm_node at ffffffff83748492 #7 [ffffba974194fdb0] krb5_kdf_feedback_cmac at ffffffffc0d76bb2 [rpcsec_gss_krb5] #8 [ffffba974194fe30] kdf_case at ffffffffc0d800a8 [gss_krb5_test] #9 [ffffba974194fe80] kunit_try_run_case at ffffffffc0d5bb54 [kunit] #10 [ffffba974194fee8] kunit_generic_run_threadfn_adapter at ffffffffc0d5e797 [kunit] #11 [ffffba974194fef8] kthread at ffffffff8313eda5 #12 [ffffba974194ff30] ret_from_fork at ffffffff830414a1 #13 [ffffba974194ff50] ret_from_fork_asm at ffffffff830039ab Here the crypto manager is trying to modprobe the camellia kernel module via a usermodehelper call: PID: 1511 TASK: ffff9aed04630000 CPU: 3 COMMAND: "cryptomgr_probe" #0 [ffffba974195fb88] __schedule at ffffffff83fd85f5 #1 [ffffba974195fc40] schedule at ffffffff83fd9672 #2 [ffffba974195fc58] schedule_timeout at ffffffff83fe03c1 #3 [ffffba974195fca8] wait_for_completion_state at ffffffff83fdb06d #4 [ffffba974195fd18] call_usermodehelper_exec at ffffffff83130313 #5 [ffffba974195fd68] __request_module at ffffffff831e325d #6 [ffffba974195fe28] crypto_alg_mod_lookup at ffffffff83748220 #7 [ffffba974195fe60] crypto_grab_spawn at ffffffff83749ff7 #8 [ffffba974195fe98] cmac_create at ffffffff8375c2f0 #9 [ffffba974195fed8] cryptomgr_probe at ffffffff83754a93 #10 [ffffba974195fef8] kthread at ffffffff8313eda5 #11 [ffffba974195ff30] ret_from_fork at ffffffff830414a1 #12 [ffffba974195ff50] ret_from_fork_asm at ffffffff830039ab And here's the resulting modprobe command, which is stuck waiting on the "kunit_run_lock" mutex: PID: 1512 TASK: ffff9aed143fafc0 CPU: 2 COMMAND: "modprobe" #0 [ffffba9741957990] __schedule at ffffffff83fd85f5 #1 [ffffba9741957a48] schedule at ffffffff83fd9672 #2 [ffffba9741957a60] schedule_preempt_disabled at ffffffff83fd9cb5 #3 [ffffba9741957a68] __mutex_lock.constprop.0 at ffffffff83fdc57a #4 [ffffba9741957ae8] __kunit_test_suites_init at ffffffffc0d5c95a [kunit] #5 [ffffba9741957b08] kunit_module_notify at ffffffffc0d5ba4b [kunit] #6 [ffffba9741957b78] notifier_call_chain at ffffffff8314647a #7 [ffffba9741957bb0] blocking_notifier_call_chain_robust at ffffffff83146565 #8 [ffffba9741957bf8] load_module at ffffffff831e1935 #9 [ffffba9741957d58] __do_sys_init_module at ffffffff831e1fba #10 [ffffba9741957e30] do_syscall_64 at ffffffff83fc3461 #11 [ffffba9741957e48] __vm_munmap at ffffffff833bcdeb #12 [ffffba9741957ee8] do_syscall_64 at ffffffff83fc3470 #13 [ffffba9741957f50] entry_SYSCALL_64_after_hwframe at ffffffff840000ea RIP: 00007f8ba092b4ae RSP: 00007ffc771e0378 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 00005572137e6e40 RCX: 00007f8ba092b4ae RDX: 0000557211c4de79 RSI: 0000000000080451 RDI: 00007f8b9ff90010 RBP: 00007ffc771e0430 R8: 00005572137e6010 R9: 0000000000000007 R10: 0000000000000001 R11: 0000000000000246 R12: 0000557211c4de79 R13: 0000000000040000 R14: 00005572137e73b0 R15: 00005572137e6400 ORIG_RAX: 00000000000000af CS: 0033 SS: 002b The camellia module doesn't even have any kunit tests, so __kunit_test_suites_init() is waiting to lock the "kunit_run_lock" mutex for nothing: crash> module -o | grep num_kunit [0x478] int num_kunit_init_suites; [0x488] int num_kunit_suites; crash> mod | grep camellia ffffffffc0da15c0 camellia_x86_64 ffffffffc0d99000 57344 (not loaded) [CONFIG_KALLSYMS] crash> px 0xffffffffc0da15c0+0x478 $1 = 0xffffffffc0da1a38 crash> px 0xffffffffc0da15c0+0x488 $2 = 0xffffffffc0da1a48 crash> rd 0xffffffffc0da1a38 ffffffffc0da1a38: 0000000000000000 ........ crash> rd 0xffffffffc0da1a48 ffffffffc0da1a48: 0000000000000000 ........ -Scott > -- > 2.44.0 > >