On 10/2/24 15:52, Guenter Roeck wrote:
> On 10/2/24 03:26, Vlastimil Babka wrote:
>> On 10/1/24 18:20, Vlastimil Babka wrote:
>>> Guenter Roeck reports that the new slub kunit tests added by commit
>>> 4e1c44b3db79 ("kunit, slub: add test_kfree_rcu() and
>>> test_leak_destroy()") cause a lockup on boot on several architectures
>>> when the kunit tests are configured to be built-in and not modules.
>>>
>>> The test_kfree_rcu test invokes kfree_rcu() and boot sequence inspection
>>> showed the runner for built-in kunit tests kunit_run_all_tests() is
>>> called before setting system_state to SYSTEM_RUNNING and calling
>>> rcu_end_inkernel_boot(), so this seems like a likely cause. So while I
>>> was unable to reproduce the problem myself, skipping the test when the
>>> slub_kunit module is built-in should avoid the issue.
>>>
>>> An alternative fix that was moving the call to kunit_run_all_tests() a
>>> bit later in the boot was tried, but has broken tests with functions
>>> marked as __init due to free_initmem() already being done.
>>>
>>> Fixes: 4e1c44b3db79 ("kunit, slub: add test_kfree_rcu() and test_leak_destroy()")
>>> Reported-by: Guenter Roeck <linux@xxxxxxxxxxxx>
>>> Closes: https://lore.kernel.org/all/6fcb1252-7990-4f0d-8027-5e83f0fb9409@xxxxxxxxxxxx/
>>
>> I hope you can confirm it helps, because the commit added two tests and I've
>> only skipped one of them, as it's the one using kfree_rcu(), which is
>> suspected. But the other is responsible for the (now suppressed)
>> kmem_cache_destroy() warning, and maybe I'm missing something and it was
>> actually that one causing the lockups.
>>
>
> Everything works with your patches applied, so we are good.

Thanks for testing! Queued for -next now and will send to Linus later if
all's good.

>> Since you mentioned the boot lockups happened on some x86_64 too, do you
>> have a .config of the lockup case? I've tried tweaking some rcu options but
>> still nothing.
>>
>
> I have a bunch of debug options enabled. Configuration (generated using
> "make savedefconfig") for x86_64 is attached.

Hmm, didn't see the hang with that (using virtme-ng) on v6.12-rc1. Guess
there's something more to it. Oh well.

> Thanks,
> Guenter