The patch titled
     Subject: test_xarray: fix soft lockup for advanced-api tests
has been added to the -mm mm-unstable branch.  Its filename is
     test_xarray-add-tests-for-advanced-multi-index-use-fix.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/test_xarray-add-tests-for-advanced-multi-index-use-fix.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Luis Chamberlain <mcgrof@xxxxxxxxxx>
Subject: test_xarray: fix soft lockup for advanced-api tests
Date: Fri, 16 Feb 2024 11:43:29 -0800

The new advanced API tests want to vet that the xarray API is doing what
it promises by manually iterating over a set of possible indexes on their
own, using a query operation which takes the RCU read lock and then
releases it.  So they deliberately do not use the loop helpers which
xarray provides.  Any loop which iterates over 1 million entries (which
is possible with order 20, so emulating say a 4 GiB block size) just to
RCU lock and unlock will eventually end up triggering a soft lockup on
systems which don't preempt and have lock proving and RCU proving
enabled.

xarray users already use XA_CHECK_SCHED for loops which may take a long
time.  In our case we don't want to RCU unlock and lock, as the caller
does that already, but rather just force a schedule every XA_CHECK_SCHED
iterations, since the test deliberately does not trust xarray and instead
checks that it is doing the right thing.

Link: https://lkml.kernel.org/r/20240216194329.840555-1-mcgrof@xxxxxxxxxx
Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
Closes: https://lkml.kernel.org/r/202402071613.70f28243-lkp@xxxxxxxxx
Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
Cc: Daniel Gomez <da.gomez@xxxxxxxxxxx>
Cc: Luis Chamberlain <mcgrof@xxxxxxxxxx>
Cc: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
Cc: Pankaj Raghav <p.raghav@xxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 lib/test_xarray.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

--- a/lib/test_xarray.c~test_xarray-add-tests-for-advanced-multi-index-use-fix
+++ a/lib/test_xarray.c
@@ -728,6 +728,7 @@ static noinline void *test_get_entry(str
 {
 	XA_STATE(xas, xa, index);
 	void *p;
+	static unsigned int i = 0;
 
 	rcu_read_lock();
 repeat:
@@ -737,6 +738,17 @@ repeat:
 		goto repeat;
 	rcu_read_unlock();
 
+	/*
+	 * This is not part of the page cache, this selftest is pretty
+	 * aggressive and does not want to trust the xarray API but rather
+	 * test it, and for order 20 (4 GiB block size) we can loop over
+	 * over a million entries which can cause a soft lockup. Page cache
+	 * APIs won't be stupid, proper page cache APIs loop over the proper
+	 * order so when using a larger order we skip shared entries.
+	 */
+	if (++i % XA_CHECK_SCHED == 0)
+		schedule();
+
 	return p;
 }
 
_

Patches currently in -mm which might be from mcgrof@xxxxxxxxxx are

test_xarray-add-tests-for-advanced-multi-index-use.patch
test_xarray-add-tests-for-advanced-multi-index-use-fix.patch
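
For readers who want to see the throttling pattern outside the kernel,
here is a minimal userspace sketch.  It is not the patch and not kernel
code.  CHECK_SCHED, lookup_and_maybe_yield() and walk_order() are
hypothetical names made up for illustration, sched_yield() merely stands
in for schedule(), and the value 4096 is assumed to match the kernel's
XA_CHECK_SCHED.  The counter is passed in explicitly rather than kept in
a function-local static as the patch does, purely so the sketch is easy
to test.

```c
#include <assert.h>
#include <limits.h>
#include <sched.h>

/* Assumption: chosen to match the kernel's XA_CHECK_SCHED. */
#define CHECK_SCHED 4096u

/*
 * Hypothetical stand-in for test_get_entry(): pretend to look up one
 * index, then give up the CPU once every CHECK_SCHED calls.
 * Returns 1 if this call yielded, 0 otherwise.
 */
static int lookup_and_maybe_yield(unsigned long index, unsigned int *calls)
{
	(void)index;	/* a real lookup of 'index' would go here */

	if (++(*calls) % CHECK_SCHED == 0) {
		sched_yield();
		return 1;
	}
	return 0;
}

/*
 * Visit every index covered by a single entry of the given order
 * (2^order indexes) and report how many times the walk yielded.
 */
unsigned long walk_order(unsigned int order)
{
	unsigned long nr, i, yields = 0;
	unsigned int calls = 0;

	assert(order < sizeof(unsigned long) * CHAR_BIT);
	nr = 1UL << order;

	for (i = 0; i < nr; i++)
		yields += lookup_and_maybe_yield(i, &calls);

	return yields;
}
```

With order 20 the walk covers 2^20 = 1,048,576 indexes (4 GiB if one
assumes 4 KiB pages), so walk_order(20) yields 1,048,576 / 4096 = 256
times instead of running the whole loop without a scheduling point.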