This is a note to let you know that I've just added the patch titled

    md: always clear ->safemode when md_check_recovery gets the mddev lock.

to the 4.12-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     md-always-clear-safemode-when-md_check_recovery-gets-the-mddev-lock.patch
and it can be found in the queue-4.12 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


From 33182d15c6bf182f7ae32a66ea4a547d979cd6d7 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@xxxxxxxx>
Date: Tue, 8 Aug 2017 16:56:36 +1000
Subject: md: always clear ->safemode when md_check_recovery gets the mddev lock.

From: NeilBrown <neilb@xxxxxxxx>

commit 33182d15c6bf182f7ae32a66ea4a547d979cd6d7 upstream.

If ->safemode == 1, md_check_recovery() will try to get the mddev lock
and perform various other checks.
If mddev->in_sync is zero, it will call set_in_sync, and clear
->safemode.  However if mddev->in_sync is not zero, ->safemode will not
be cleared.

When md_check_recovery() drops the mddev lock, the thread is woken
up again.  Normally it would just check if there was anything else to
do, find nothing, and go to sleep.  However as ->safemode was not
cleared, it will take the mddev lock again, then wake itself up
when unlocking.

This results in an infinite loop, repeatedly calling
md_check_recovery(), which RCU or the soft-lockup detector
will eventually complain about.

Prior to commit 4ad23a976413 ("MD: use per-cpu counter for
writes_pending"), safemode would only be set to one when the
writes_pending counter reached zero, and would be cleared again
when writes_pending is incremented.  Since that patch, safemode
is set more freely, but is not reliably cleared.

So in md_check_recovery() clear ->safemode before checking ->in_sync.

Fixes: 4ad23a976413 ("MD: use per-cpu counter for writes_pending")
Reported-by: Dominik Brodowski <linux@xxxxxxxxxxxxxxxxxxxx>
Reported-by: David R <david@xxxxxxxxxxxxxxx>
Signed-off-by: NeilBrown <neilb@xxxxxxxx>
Signed-off-by: Shaohua Li <shli@xxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
 drivers/md/md.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8639,6 +8639,9 @@ void md_check_recovery(struct mddev *mdd
 	if (mddev_trylock(mddev)) {
 		int spares = 0;
 
+		if (mddev->safemode == 1)
+			mddev->safemode = 0;
+
 		if (mddev->ro) {
 			struct md_rdev *rdev;
 			if (!mddev->external && mddev->in_sync)


Patches currently in stable-queue which might be from neilb@xxxxxxxx are

queue-4.12/md-not-clear-safemode-for-external-metadata-array.patch
queue-4.12/md-always-clear-safemode-when-md_check_recovery-gets-the-mddev-lock.patch
queue-4.12/md-fix-test-in-md_write_start.patch
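
For readers who want to see the loop described in the commit message in isolation, here is a minimal user-space sketch in C. It is not kernel code: struct toy_mddev, toy_check_recovery() and toy_passes_until_idle() are hypothetical names invented for this illustration, the lock is assumed to always be acquired, and the real entry conditions of md_check_recovery() are reduced to the single ->safemode test discussed above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few struct mddev fields the commit
 * message talks about; the real structure is far larger. */
struct toy_mddev {
	int safemode;	/* 1: the md thread should try to mark the array clean */
	int in_sync;	/* non-zero: the array is already marked clean */
	bool wakeup;	/* models "the md thread has been woken up" */
};

/* One pass of a heavily simplified md_check_recovery().
 * 'patched' selects the behaviour with the three added lines. */
static void toy_check_recovery(struct toy_mddev *m, bool patched)
{
	m->wakeup = false;

	if (m->safemode != 1)
		return;			/* nothing to do: the thread sleeps */

	/* Assumption: mddev_trylock() succeeds here. */
	if (patched && m->safemode == 1)
		m->safemode = 0;	/* the fix: always clear ->safemode */

	if (!m->in_sync) {
		m->in_sync = 1;		/* stands in for set_in_sync() */
		m->safemode = 0;
	}

	/* Dropping the mddev lock wakes the thread again. */
	m->wakeup = true;
}

/* Run passes until the thread would sleep, or until 'limit' passes
 * have been made (a stand-in for the soft-lockup detector firing). */
static int toy_passes_until_idle(int in_sync, bool patched, int limit)
{
	struct toy_mddev m = { .safemode = 1, .in_sync = in_sync, .wakeup = true };
	int passes = 0;

	while (m.wakeup && passes < limit) {
		toy_check_recovery(&m, patched);
		passes++;
	}
	return passes;
}
```

In this model, starting with in_sync already set and the unpatched logic, the loop only stops because it hits the pass limit. With the patched logic, or with in_sync initially zero, the thread goes idle on its second pass.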