Hi,
On 2024/07/19 16:02, Mateusz Kusiak wrote:
On 19.07.2024 09:02, Yu Kuai wrote:
Hi,
After some discussion and log collection, this looks like a deadlock
introduced by:
https://lore.kernel.org/r/20230825031622.1530464-8-yukuai1@xxxxxxxxxxxxxxx
The root cause is:
1) New IO is blocked because the array is suspended;
2) md_start_sync suspends the array and waits for inflight IO
to be done;
3) Inflight IO is waiting for md_start_sync to be done, from
md_write_start->flush_work().
Can you give the following patch a test?
Thanks!
Kuai
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 64693913ed18..10c2d816062a 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8668,7 +8668,6 @@ void md_write_start(struct mddev *mddev, struct bio *bi)
 	BUG_ON(mddev->ro == MD_RDONLY);
 	if (mddev->ro == MD_AUTO_READ) {
 		/* need to switch to read/write */
-		flush_work(&mddev->sync_work);
 		mddev->ro = MD_RDWR;
 		set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
 		md_wakeup_thread(mddev->thread);
Hi Kuai,
With the patch you provided the issue still reproduces.
Thanks for the test. With this problem eliminated, can you collect
logs again with this patch applied?
Thanks,
Kuai
Thanks,
Mateusz