Re: [PATCH v3 1/3] md/raid1: freeze array more strictly when reshape

Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> · Tue, 1 Aug 2023 09:24:13 +0800

Hi,

在 2023/07/31 22:02, Xueshi Hu 写道:
On Thu, Jul 20, 2023 at 09:37:38AM +0800, Yu Kuai wrote:
Hi,

在 2023/07/20 9:36, Yu Kuai 写道:
Hi,

在 2023/07/19 15:09, Xueshi Hu 写道:
When an IO error happens, reschedule_retry() will increase
r1conf::nr_queued, which makes freeze_array() unblocked. However, before
all r1bio in the memory pool are released, the memory pool should not be
modified. Introduce freeze_array_totally() to solve the problem. Compared
to freeze_array(), it's more strict because any in-flight io needs to
complete including queued io.

Signed-off-by: Xueshi Hu <xueshi.hu@xxxxxxxxxx>
---
   drivers/md/raid1.c | 35 +++++++++++++++++++++++++++++++++--
   1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index dd25832eb045..5605c9680818 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1072,7 +1072,7 @@ static void freeze_array(struct r1conf *conf,
int extra)
       /* Stop sync I/O and normal I/O and wait for everything to
        * go quiet.
        * This is called in two situations:
-     * 1) management command handlers (reshape, remove disk, quiesce).
+     * 1) management command handlers (remove disk, quiesce).
        * 2) one normal I/O request failed.
        * After array_frozen is set to 1, new sync IO will be blocked at
@@ -1111,6 +1111,37 @@ static void unfreeze_array(struct r1conf *conf)
       wake_up(&conf->wait_barrier);
   }
+/* conf->resync_lock should be held */
+static int get_pending(struct r1conf *conf)
+{
+    int idx, ret;
+
+    ret = atomic_read(&conf->nr_sync_pending);
+    for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++)
+        ret += atomic_read(&conf->nr_pending[idx]);
+
+    return ret;
+}
+
+static void freeze_array_totally(struct r1conf *conf)
+{
+    /*
+     * freeze_array_totally() is almost the same with
freeze_array() except
+     * it requires there's no queued io. Raid1's reshape will
destroy the
+     * old mempool and change r1conf::raid_disks, which are
necessary when
+     * freeing the queued io.
+     */
+    spin_lock_irq(&conf->resync_lock);
+    conf->array_frozen = 1;
+    raid1_log(conf->mddev, "freeze totally");
+    wait_event_lock_irq_cmd(
+            conf->wait_barrier,
+            get_pending(conf) == 0,
+            conf->resync_lock,
+            md_wakeup_thread(conf->mddev->thread));
+    spin_unlock_irq(&conf->resync_lock);
+}
+
   static void alloc_behind_master_bio(struct r1bio *r1_bio,
                          struct bio *bio)
   {
@@ -3296,7 +3327,7 @@ static int raid1_reshape(struct mddev *mddev)
           return -ENOMEM;
       }
-    freeze_array(conf, 0);
+    freeze_array_totally(conf);

I think this is wrong, raid1_reshape() can't be called with
Sorry about thi typo, I mean raid1_reshape() can be called with ...
You're right, this is indeed a deadlock.

I am wondering whether this approach is viable

	if (unlikely(atomic_read(conf->nr_queued))) {
		kfree(newpoolinfo);
		mempool_exit(&newpool);
		unfreeze_array(conf);

		set_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
		set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
		md_wakeup_thread(mddev->thread);
		return -EBUSY;
	}

This is not okay, 'nr_queued' can be incresed at any time when normal io
failed, read it once doesn't mean anything, and you need to
freeze_array() before reading it:

freeze_array
// guarantee new io won't be dispatched
if (atomic_read(conf->nr_queued))
 ...
 unfreeze_array
 return -EBUSY;

Fortunately, I'm working on another patchset to synchronize io with
array configuration, which means all the callers of raid1_reshape() will
suspend the array, and no normal io will be in progress, hence this
problem won't exist as well.

Thanks,
Kuai


Thanks,
Hu


Thanks,
Kuai
'reconfig_mutex' grabbed, and this will deadlock because failed io need
this lock to be handled by daemon thread.(see details in [1]).

Be aware that never hold 'reconfig_mutex' to wait for io.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?h=md-next&id=c4fe7edfc73f750574ef0ec3eee8c2de95324463

       /* ok, everything is stopped */
       oldpool = conf->r1bio_pool;


.


.