Patch "md: always clear ->safemode when md_check_recovery gets the mddev lock." has been added to the 4.12-stable tree

<gregkh@xxxxxxxxxxxxxxxxxxx> · Fri, 18 Aug 2017 16:38:25 -0700

This is a note to let you know that I've just added the patch titled

    md: always clear ->safemode when md_check_recovery gets the mddev lock.

to the 4.12-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     md-always-clear-safemode-when-md_check_recovery-gets-the-mddev-lock.patch
and it can be found in the queue-4.12 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


>From 33182d15c6bf182f7ae32a66ea4a547d979cd6d7 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@xxxxxxxx>
Date: Tue, 8 Aug 2017 16:56:36 +1000
Subject: md: always clear ->safemode when md_check_recovery gets the mddev lock.

From: NeilBrown <neilb@xxxxxxxx>

commit 33182d15c6bf182f7ae32a66ea4a547d979cd6d7 upstream.

If ->safemode == 1, md_check_recovery() will try to get the mddev lock
and perform various other checks.
If mddev->in_sync is zero, it will call set_in_sync, and clear
->safemode.  However if mddev->in_sync is not zero, ->safemode will not
be cleared.

When md_check_recovery() drops the mddev lock, the thread is woken
up again.  Normally it would just check if there was anything else to
do, find nothing, and go to sleep.  However as ->safemode was not
cleared, it will take the mddev lock again, then wake itself up
when unlocking.

This results in an infinite loop, repeatedly calling
md_check_recovery(), which RCU or the soft-lockup detector
will eventually complain about.

Prior to commit 4ad23a976413 ("MD: use per-cpu counter for
writes_pending"), safemode would only be set to one when the
writes_pending counter reached zero, and would be cleared again
when writes_pending is incremented.  Since that patch, safemode
is set more freely, but is not reliably cleared.

So in md_check_recovery() clear ->safemode before checking ->in_sync.

Fixes: 4ad23a976413 ("MD: use per-cpu counter for writes_pending")
Reported-by: Dominik Brodowski <linux@xxxxxxxxxxxxxxxxxxxx>
Reported-by: David R <david@xxxxxxxxxxxxxxx>
Signed-off-by: NeilBrown <neilb@xxxxxxxx>
Signed-off-by: Shaohua Li <shli@xxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
 drivers/md/md.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8639,6 +8639,9 @@ void md_check_recovery(struct mddev *mdd
 	if (mddev_trylock(mddev)) {
 		int spares = 0;
 
+		if (mddev->safemode == 1)
+			mddev->safemode = 0;
+
 		if (mddev->ro) {
 			struct md_rdev *rdev;
 			if (!mddev->external && mddev->in_sync)


Patches currently in stable-queue which might be from neilb@xxxxxxxx are

queue-4.12/md-not-clear-safemode-for-external-metadata-array.patch
queue-4.12/md-always-clear-safemode-when-md_check_recovery-gets-the-mddev-lock.patch
queue-4.12/md-fix-test-in-md_write_start.patch