This is a note to let you know that I've just added the patch titled

    md: Don't ignore suspended array in md_check_recovery()

to the 6.7-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     md-don-t-ignore-suspended-array-in-md_check_recovery.patch
and it can be found in the queue-6.7 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


From 1baae052cccd08daf9a9d64c3f959d8cdb689757 Mon Sep 17 00:00:00 2001
From: Yu Kuai <yukuai3@xxxxxxxxxx>
Date: Thu, 1 Feb 2024 17:25:46 +0800
Subject: md: Don't ignore suspended array in md_check_recovery()

From: Yu Kuai <yukuai3@xxxxxxxxxx>

commit 1baae052cccd08daf9a9d64c3f959d8cdb689757 upstream.

mddev_suspend() never stops sync_thread, so it makes no sense to ignore
a suspended array in md_check_recovery(); doing so can leave sync_thread
unable to be unregistered.

After commit f52f5c71f3d4 ("md: fix stopping sync thread"), the following
hang can be triggered by the test shell/integrity-caching.sh:

1) suspend the array:
raid_postsuspend
 mddev_suspend

2) stop the array:
raid_dtr
 md_stop
  __md_stop_writes
   stop_sync_thread
    set_bit(MD_RECOVERY_INTR, &mddev->recovery);
    md_wakeup_thread_directly(mddev->sync_thread);
    wait_event(..., !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))

3) sync thread done:
md_do_sync
 set_bit(MD_RECOVERY_DONE, &mddev->recovery);
 md_wakeup_thread(mddev->thread);

4) daemon thread can't unregister sync thread:
md_check_recovery
 if (mddev->suspended)
   return; -> return directly
 md_reap_sync_thread
 clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
 -> MD_RECOVERY_RUNNING can't be cleared, hence step 2 hangs;

This problem is not specific to dm-raid. Fix it by no longer ignoring a
suspended array in md_check_recovery(). Follow-up patches will improve
dm-raid so that it freezes the sync thread during suspend.

Reported-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>
Closes: https://lore.kernel.org/all/8fb335e-6d2c-dbb5-d7-ded8db5145a@xxxxxxxxxx/
Fixes: 68866e425be2 ("MD: no sync IO while suspended")
Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
Cc: stable@xxxxxxxxxxxxxxx # v6.7+
Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
Signed-off-by: Song Liu <song@xxxxxxxxxx>
Link: https://lore.kernel.org/r/20240201092559.910982-2-yukuai1@xxxxxxxxxxxxxxx
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
 drivers/md/md.c |    3 ---
 1 file changed, 3 deletions(-)

--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9519,9 +9519,6 @@ not_running:
  */
 void md_check_recovery(struct mddev *mddev)
 {
-	if (READ_ONCE(mddev->suspended))
-		return;
-
 	if (mddev->bitmap)
 		md_bitmap_daemon_work(mddev);
 


Patches currently in stable-queue which might be from yukuai3@xxxxxxxxxx are

queue-6.7/md-fix-missing-release-of-active_io-for-flush.patch
queue-6.7/md-don-t-register-sync_thread-for-reshape-directly.patch
queue-6.7/md-make-sure-md_do_sync-will-set-md_recovery_done.patch
queue-6.7/md-don-t-suspend-the-array-for-interrupted-reshape.patch
queue-6.7/md-don-t-ignore-suspended-array-in-md_check_recovery.patch
queue-6.7/md-don-t-ignore-read-only-array-in-md_check_recovery.patch
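

For readers following the four-step hang described in the commit message,
the sequence can be approximated by a small user-space model in C. This is
only a sketch under stated assumptions, not kernel code: every model_* name
and every MODEL_* flag value is hypothetical, the steps run sequentially
instead of in separate threads, and the "reap" step is reduced to clearing
the flag bits. It only illustrates why the early return on a suspended
array keeps the RUNNING bit set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the recovery flag bits (not the kernel values). */
enum {
	MODEL_RECOVERY_RUNNING = 1u << 0,
	MODEL_RECOVERY_INTR    = 1u << 1,
	MODEL_RECOVERY_DONE    = 1u << 2,
};

/* Hypothetical, heavily reduced model of the array state. */
struct model_mddev {
	bool suspended;
	unsigned int recovery;
};

/*
 * Model of one daemon-thread pass. With skip_when_suspended set, it
 * mimics the removed early return; otherwise it mimics the patched code.
 */
static void model_check_recovery(struct model_mddev *m, bool skip_when_suspended)
{
	assert(m != NULL);

	if (skip_when_suspended && m->suspended)
		return;	/* the sync thread is never reaped */

	if (m->recovery & MODEL_RECOVERY_DONE) {
		/* Stand-in for reaping the sync thread: clear the flag bits. */
		m->recovery &= ~(unsigned int)(MODEL_RECOVERY_RUNNING |
					       MODEL_RECOVERY_INTR |
					       MODEL_RECOVERY_DONE);
	}
}

/*
 * Replays steps 1-4 in order and reports whether the waiter from step 2
 * would be released, i.e. whether the RUNNING bit ended up cleared.
 */
static bool model_stop_completes(bool skip_when_suspended)
{
	struct model_mddev m = {
		.suspended = false,
		.recovery = MODEL_RECOVERY_RUNNING,	/* sync thread active */
	};

	m.suspended = true;				/* 1) suspend the array */
	m.recovery |= MODEL_RECOVERY_INTR;		/* 2) stop: interrupt, then wait */
	m.recovery |= MODEL_RECOVERY_DONE;		/* 3) sync thread done */
	model_check_recovery(&m, skip_when_suspended);	/* 4) daemon pass */

	return !(m.recovery & MODEL_RECOVERY_RUNNING);
}
```

In this model, model_stop_completes(true) corresponds to the behaviour
before the patch and returns false (the waiter in step 2 would never be
woken), while model_stop_completes(false) corresponds to the patched
md_check_recovery() and returns true.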