On Tue, Nov 5, 2024 at 4:29 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote: > > Hi, > > 在 2024/11/05 16:16, Mariusz Tkaczyk 写道: > > On Tue, 5 Nov 2024 15:57:33 +0800 > > Xiao Ni <xni@xxxxxxxxxx> wrote: > > > >> One customer reports a bug: raid5 is hung when changing thread cnt > >> while resync is running. The stripes are all in conf->handle_list > >> and new threads can't handle them. > > > > Is issue fixed with this patch? Is is missing here :) > >> > >> Commit b39f35ebe86d ("md: don't quiesce in mddev_suspend()") removes > >> pers->quiesce from mddev_suspend/resume, then we can't guarantee sync > >> requests finish in suspend operation. One personality knows itself the > >> best. So pers->quiesce is a proper way to let personality quiesce. > > > >> > >> Fixes: b39f35ebe86d ("md: don't quiesce in mddev_suspend()") > >> Signed-off-by: Xiao Ni <xni@xxxxxxxxxx> > >> --- > >> drivers/md/md.c | 6 ++++++ > >> 1 file changed, 6 insertions(+) > >> > >> diff --git a/drivers/md/md.c b/drivers/md/md.c > >> index 67108c397c5a..7409ecb2df68 100644 > >> --- a/drivers/md/md.c > >> +++ b/drivers/md/md.c > >> @@ -482,6 +482,9 @@ int mddev_suspend(struct mddev *mddev, bool interruptible) > >> return err; > >> } > >> > >> + if (mddev->pers) > >> + mddev->pers->quiesce(mddev, 1); > >> + > > > > Shouldn't it be implemented as below? According to b39f35ebe86d, some levels are > > not implementing this? > > > > + if (mddev->pers && mddev->pers->quiesce) > > + mddev->pers->quiesce(mddev, 1); > > It's fine, the fops is never NULL, just some levels points to an empty > function. > > The tricky part here is that accessing "mddev->pers" is not safe here, > 'reconfig_mutex' is not held in mddev_suspend(), and can concurrent > with, for example, change levels. :( Hi Kuai Now mddev->suspended is protected by mddev->suspend_mutex. It should can avoid the problem you mentioned above? level_store calls mddev_suspend and mddev->suspended is set to mddev->suspended+1. So other paths will return because mddev->suspended is not 0. Regards Xiao > > Thanks, > Kuai > > > > > Is it reproducible with upstream kernel? > > > > Thanks, > > Mariusz > > > > . > > >