Hi,
在 2024/11/05 16:16, Mariusz Tkaczyk 写道:
On Tue, 5 Nov 2024 15:57:33 +0800
Xiao Ni <xni@xxxxxxxxxx> wrote:
One customer reports a bug: raid5 is hung when changing thread cnt
while resync is running. The stripes are all in conf->handle_list
and new threads can't handle them.
Is issue fixed with this patch? Is is missing here :)
Commit b39f35ebe86d ("md: don't quiesce in mddev_suspend()") removes
pers->quiesce from mddev_suspend/resume, then we can't guarantee sync
requests finish in suspend operation. One personality knows itself the
best. So pers->quiesce is a proper way to let personality quiesce.
Fixes: b39f35ebe86d ("md: don't quiesce in mddev_suspend()")
Signed-off-by: Xiao Ni <xni@xxxxxxxxxx>
---
drivers/md/md.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 67108c397c5a..7409ecb2df68 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -482,6 +482,9 @@ int mddev_suspend(struct mddev *mddev, bool interruptible)
return err;
}
+ if (mddev->pers)
+ mddev->pers->quiesce(mddev, 1);
+
Shouldn't it be implemented as below? According to b39f35ebe86d, some levels are
not implementing this?
+ if (mddev->pers && mddev->pers->quiesce)
+ mddev->pers->quiesce(mddev, 1);
It's fine, the fops is never NULL, just some levels points to an empty
function.
The tricky part here is that accessing "mddev->pers" is not safe here,
'reconfig_mutex' is not held in mddev_suspend(), and can concurrent
with, for example, change levels. :(
Thanks,
Kuai
Is it reproducible with upstream kernel?
Thanks,
Mariusz
.