Patch "md: Don't suspend the array for interrupted reshape" has been added to the 6.7-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    md: Don't suspend the array for interrupted reshape

to the 6.7-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     md-don-t-suspend-the-array-for-interrupted-reshape.patch
and it can be found in the queue-6.7 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


>From 9e46c70e829bddc24e04f963471e9983a11598b7 Mon Sep 17 00:00:00 2001
From: Yu Kuai <yukuai3@xxxxxxxxxx>
Date: Thu, 1 Feb 2024 17:25:50 +0800
Subject: md: Don't suspend the array for interrupted reshape

From: Yu Kuai <yukuai3@xxxxxxxxxx>

commit 9e46c70e829bddc24e04f963471e9983a11598b7 upstream.

md_start_sync() will suspend the array if there are spares that can be
added or removed from conf, however, if reshape is still in progress,
this won't happen at all or data will be corrupted(remove_and_add_spares
won't be called from md_choose_sync_action for reshape), hence there is
no need to suspend the array if reshape is not done yet.

Meanwhile, there is a potential deadlock for raid456:

1) reshape is interrupted;

2) set one of the disk WantReplacement, and add a new disk to the array,
   however, recovery won't start until the reshape is finished;

3) then issue an IO across reshpae position, this IO will wait for
   reshape to make progress;

4) continue to reshape, then md_start_sync() found there is a spare disk
   that can be added to conf, mddev_suspend() is called;

Step 4 and step 3 is waiting for each other, deadlock triggered. Noted
this problem is found by code review, and it's not reporduced yet.

Fix this porblem by don't suspend the array for interrupted reshape,
this is safe because conf won't be changed until reshape is done.

Fixes: bc08041b32ab ("md: suspend array in md_start_sync() if array need reconfiguration")
Cc: stable@xxxxxxxxxxxxxxx # v6.7+
Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
Signed-off-by: Song Liu <song@xxxxxxxxxx>
Link: https://lore.kernel.org/r/20240201092559.910982-6-yukuai1@xxxxxxxxxxxxxxx
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
 drivers/md/md.c |   13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9424,12 +9424,17 @@ static void md_start_sync(struct work_st
 	bool suspend = false;
 	char *name;
 
-	if (md_spares_need_change(mddev))
+	/*
+	 * If reshape is still in progress, spares won't be added or removed
+	 * from conf until reshape is done.
+	 */
+	if (mddev->reshape_position == MaxSector &&
+	    md_spares_need_change(mddev)) {
 		suspend = true;
+		mddev_suspend(mddev, false);
+	}
 
-	suspend ? mddev_suspend_and_lock_nointr(mddev) :
-		  mddev_lock_nointr(mddev);
-
+	mddev_lock_nointr(mddev);
 	if (!md_is_rdwr(mddev)) {
 		/*
 		 * On a read-only array we can:


Patches currently in stable-queue which might be from yukuai3@xxxxxxxxxx are

queue-6.7/md-fix-missing-release-of-active_io-for-flush.patch
queue-6.7/md-don-t-register-sync_thread-for-reshape-directly.patch
queue-6.7/md-make-sure-md_do_sync-will-set-md_recovery_done.patch
queue-6.7/md-don-t-suspend-the-array-for-interrupted-reshape.patch
queue-6.7/md-don-t-ignore-suspended-array-in-md_check_recovery.patch
queue-6.7/md-don-t-ignore-read-only-array-in-md_check_recovery.patch




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux