Hi Donald,
On 1/26/21 10:50, Donald Buczek wrote:
[...]
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 2d21c298ffa7..f40429843906 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4687,11 +4687,13 @@ action_store(struct mddev *mddev, const char *page, size_t len)
 		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 		if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) &&
 		    mddev_lock(mddev) == 0) {
+			set_bit(MD_ALLOW_SB_UPDATE, &mddev->flags);
 			flush_workqueue(md_misc_wq);
 			if (mddev->sync_thread) {
 				set_bit(MD_RECOVERY_INTR, &mddev->recovery);
 				md_reap_sync_thread(mddev);
 			}
+			clear_bit(MD_ALLOW_SB_UPDATE, &mddev->flags);
 			mddev_unlock(mddev);
 		}
 	} else if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
Yes, it could break the deadlock issue, but I am not sure it is the
right way. We only set ALLOW_SB_UPDATE in suspend, which makes sense
since the I/O will be quiesced there, but the write-idle action can't
guarantee the same thing.
Thinking of (and documenting) MD_ALLOW_SB_UPDATE as "the holder of
reconfig_mutex promises not to make any changes which would exclude
superblocks from being written" might make it easier to accept the usage.
I am not sure it is safe to set the flag here, since write idle can't
prevent I/O from the filesystem, while mddev_suspend can guarantee that.
I would prefer to make the resync thread not wait forever here.
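For context, mddev_suspend() only holds ALLOW_SB_UPDATE across the window
in which in-flight I/O is drained and the personality is quiesced; a
paraphrased sketch (not verbatim md.c):

void mddev_suspend(struct mddev *mddev)
{
	...
	wake_up(&mddev->sb_wait);
	/* let md_check_recovery() update the superblock without reconfig_mutex */
	set_bit(MD_ALLOW_SB_UPDATE, &mddev->flags);
	smp_mb__after_atomic();
	/* wait for in-flight I/O to drain, then quiesce the personality */
	wait_event(mddev->sb_wait, atomic_read(&mddev->active_io) == 0);
	mddev->pers->quiesce(mddev, 1);
	clear_bit_unlock(MD_ALLOW_SB_UPDATE, &mddev->flags);
	wait_event(mddev->sb_wait, !test_bit(MD_UPDATING_SB, &mddev->flags));
	...
}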
[...]
- sh = raid5_get_active_stripe(conf, new_sector, previous,
+ sh = raid5_get_active_stripe(conf, new_sector, previous, 0,
Mistake here (fourth argument added instead of third)
Thanks for checking.
[...]
Unfortunately, this patch did not fix the issue.
root@sloth:~/linux# cat /proc/$(pgrep md3_resync)/stack
[<0>] raid5_get_active_stripe+0x1e7/0x6b0
[<0>] raid5_sync_request+0x2a7/0x3d0
[<0>] md_do_sync.cold+0x3ee/0x97c
[<0>] md_thread+0xab/0x160
[<0>] kthread+0x11b/0x140
[<0>] ret_from_fork+0x22/0x30
which is (according to objdump -d -Sl drivers/md/raid5.o) at
https://elixir.bootlin.com/linux/v5.11-rc5/source/drivers/md/raid5.c#L735
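That is the wait in raid5_get_active_stripe(); approximately (condition
paraphrased from the v5.11 source, so possibly not verbatim):

	wait_event_lock_irq(conf->wait_for_stripe,
		!list_empty(conf->inactive_list + hash) &&
		(atomic_read(&conf->active_stripes)
		 < (conf->max_nr_stripes * 3 / 4)
		 || !test_bit(R5_INACTIVE_BLOCKED, &conf->cache_state)),
		*(conf->hash_locks + hash));

i.e. the resync thread sleeps until enough stripes are released back to
the inactive list.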
Isn't it still the case that the superblocks are not written, therefore
stripes are not processed, and therefore the number of active stripes
does not decrease? Who is expected to wake up the conf->wait_for_stripe
waiters?
Hmm, how about waking the waiter up in the while loop of raid5d?
@@ -6520,6 +6532,11 @@ static void raid5d(struct md_thread *thread)
 			md_check_recovery(mddev);
 			spin_lock_irq(&conf->device_lock);
 		}
+
+		if ((atomic_read(&conf->active_stripes)
+		     < (conf->max_nr_stripes * 3 / 4) ||
+		    (test_bit(MD_RECOVERY_INTR, &mddev->recovery))))
+			wake_up(&conf->wait_for_stripe);
 	}
 	pr_debug("%d stripes handled\n", handled);
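The idea being that raid5d, which is the thread that actually releases
stripes, would then kick conf->wait_for_stripe on each pass once
MD_RECOVERY_INTR is set, instead of relying on the active stripe count
ever dropping.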
If the issue still appears, then I will change the waiter to break out
just if MD_RECOVERY_INTR is set, something like this:
wait_event_lock_irq(conf->wait_for_stripe,
(test_bit(MD_RECOVERY_INTR, &mddev->recovery) && sync_req) ||
/* the previous condition */,
*(conf->hash_locks + hash));
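(Here sync_req would be a new flag or argument passed in from the resync
path, so that only the resync thread breaks out early while normal
writers keep waiting on the existing condition.)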
Thanks,
Guoqing