Re: [PATCH RFC 1/1] md: Use pers->quiesce in mddev_suspend

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Nov 5, 2024 at 4:29 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> 在 2024/11/05 16:16, Mariusz Tkaczyk 写道:
> > On Tue,  5 Nov 2024 15:57:33 +0800
> > Xiao Ni <xni@xxxxxxxxxx> wrote:
> >
> >> One customer reports a bug: raid5 is hung when changing thread cnt
> >> while resync is running. The stripes are all in conf->handle_list
> >> and new threads can't handle them.
> >
> > Is issue fixed with this patch? Is is missing here :)
> >>
> >> Commit b39f35ebe86d ("md: don't quiesce in mddev_suspend()") removes
> >> pers->quiesce from mddev_suspend/resume, then we can't guarantee sync
> >> requests finish in suspend operation. One personality knows itself the
> >> best. So pers->quiesce is a proper way to let personality quiesce.
> >
> >>
> >> Fixes: b39f35ebe86d ("md: don't quiesce in mddev_suspend()")
> >> Signed-off-by: Xiao Ni <xni@xxxxxxxxxx>
> >> ---
> >>   drivers/md/md.c | 6 ++++++
> >>   1 file changed, 6 insertions(+)
> >>
> >> diff --git a/drivers/md/md.c b/drivers/md/md.c
> >> index 67108c397c5a..7409ecb2df68 100644
> >> --- a/drivers/md/md.c
> >> +++ b/drivers/md/md.c
> >> @@ -482,6 +482,9 @@ int mddev_suspend(struct mddev *mddev, bool interruptible)
> >>              return err;
> >>      }
> >>
> >> +    if (mddev->pers)
> >> +            mddev->pers->quiesce(mddev, 1);
> >> +
> >
> > Shouldn't it be implemented as below? According to b39f35ebe86d, some levels are
> > not implementing this?
> >
> > +     if (mddev->pers && mddev->pers->quiesce)
> > +             mddev->pers->quiesce(mddev, 1);
>
> It's fine, the fops is never NULL, just some levels points to an empty
> function.
>
> The tricky part here is that accessing "mddev->pers" is not safe here,
> 'reconfig_mutex' is not held in mddev_suspend(), and can concurrent
> with, for example, change levels. :(

Hi Kuai

Now mddev->suspended is protected by mddev->suspend_mutex. It should
can avoid the problem you mentioned above? level_store calls
mddev_suspend and mddev->suspended is set to mddev->suspended+1. So
other paths will return because mddev->suspended is not 0.

Regards
Xiao
>
> Thanks,
> Kuai
>
> >
> > Is it reproducible with upstream kernel?
> >
> > Thanks,
> > Mariusz
> >
> > .
> >
>






[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux