Re: [PATCH] md: don't unregister sync_thread with reconfig_mutex held

On Thu, Feb 11, 2021 at 8:31 AM Song Liu <song@xxxxxxxxxx> wrote:
>
> On Tue, Feb 9, 2021 at 6:22 PM Guoqing Jiang
> <guoqing.jiang@xxxxxxxxxxxxxxx> wrote:
> >
> > Unregistering sync_thread doesn't need to hold reconfig_mutex, since it
> > doesn't reconfigure the array.
> >
> > Holding it can also cause a deadlock for raid5, as follows:
> >
> > 1. Process A tries to reap the sync thread with reconfig_mutex held after
> >    echoing idle to sync_action.
> > 2. The raid5 sync thread is blocked when there are too many active
> >    stripes.
> > 3. SB_CHANGE_PENDING is set (because write IO comes from the upper layer),
> >    which prevents the number of active stripes from decreasing.
> > 4. SB_CHANGE_PENDING can't be cleared because md_check_recovery is not
> >    able to take reconfig_mutex.
> >
> > More details in the link:
> > https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@xxxxxxxxxxxxx/T/#t
> >
> > Reported-and-tested-by: Donald Buczek <buczek@xxxxxxxxxxxxx>
> > Signed-off-by: Guoqing Jiang <guoqing.jiang@xxxxxxxxxxxxxxx>
>
> Thanks for debugging the issue. However, I am not sure whether this is
> the proper fix. For example, would this break dm-raid.c:raid_message()?
> IIUC, raid_message() calls md_reap_sync_thread() without holding
> reconfig_mutex, no?
>
> Thanks,
> Song.
Right.
A simple solution would be to add a parameter to md_reap_sync_thread() to
indicate whether reconfig_mutex is held.
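
To make that concrete, here is a rough userspace sketch using pthreads. It is
not kernel code and not a tested patch. The names fake_mddev, fake_sync_thread,
fake_reap_sync_thread, demo_reap and the "locked" parameter are all made up
for illustration, and the sync thread is simplified to something that needs
the mutex once before it can exit. The idea is only that the reap function
drops the lock around the join when the caller says it holds it, and leaves
the lock alone otherwise (the dm-raid style caller).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for struct mddev; every name here is hypothetical. */
struct fake_mddev {
	pthread_mutex_t reconfig_mutex;	/* models mddev->reconfig_mutex */
	pthread_t sync_thread;		/* models mddev->sync_thread */
	bool sync_running;
	int sb_change_pending;		/* models the SB_CHANGE_PENDING flag */
};

/*
 * Simplified sync thread: it cannot finish until it has taken the mutex
 * once, standing in for "progress requires someone to get reconfig_mutex".
 */
static void *fake_sync_thread(void *arg)
{
	struct fake_mddev *m = arg;

	pthread_mutex_lock(&m->reconfig_mutex);
	m->sb_change_pending = 0;
	pthread_mutex_unlock(&m->reconfig_mutex);
	return NULL;
}

/*
 * Reap the sync thread. "locked" tells us whether the caller holds
 * reconfig_mutex; if so, drop it around the join and retake it afterwards.
 */
static void fake_reap_sync_thread(struct fake_mddev *m, bool locked)
{
	assert(m != NULL);

	if (locked)
		pthread_mutex_unlock(&m->reconfig_mutex);
	pthread_join(m->sync_thread, NULL);
	m->sync_running = false;
	if (locked)
		pthread_mutex_lock(&m->reconfig_mutex);
}

/* Runs one reap with or without the lock held; returns 0 on success. */
static int demo_reap(bool caller_holds_lock)
{
	struct fake_mddev m = { .sb_change_pending = 1 };
	int ret;

	pthread_mutex_init(&m.reconfig_mutex, NULL);
	if (caller_holds_lock)
		pthread_mutex_lock(&m.reconfig_mutex);

	if (pthread_create(&m.sync_thread, NULL, fake_sync_thread, &m) != 0) {
		ret = -1;
	} else {
		m.sync_running = true;
		fake_reap_sync_thread(&m, caller_holds_lock);
		ret = (!m.sync_running && m.sb_change_pending == 0) ? 0 : -1;
	}

	if (caller_holds_lock)
		pthread_mutex_unlock(&m.reconfig_mutex);
	pthread_mutex_destroy(&m.reconfig_mutex);
	return ret;
}
```

Calling demo_reap(true) models the md.c callers that hold the lock, and
demo_reap(false) models a caller that does not. If the reap function joined
with the lock still held in the first case, the fake sync thread could never
take the mutex and the join would hang, which is the shape of the reported
deadlock.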

Regards!
Jack
>
> > ---
> >  drivers/md/md.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > index ca40942..eec8c27 100644
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -9365,13 +9365,18 @@ void md_check_recovery(struct mddev *mddev)
> >  EXPORT_SYMBOL(md_check_recovery);
> >
> >  void md_reap_sync_thread(struct mddev *mddev)
> > +       __releases(&mddev->reconfig_mutex)
> > +       __acquires(&mddev->reconfig_mutex)
> > +
> >  {
> >         struct md_rdev *rdev;
> >         sector_t old_dev_sectors = mddev->dev_sectors;
> >         bool is_reshaped = false;
> >
> >         /* resync has finished, collect result */
> > +       mddev_unlock(mddev);
> >         md_unregister_thread(&mddev->sync_thread);
> > +       mddev_lock_nointr(mddev);
> >         if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
> >             !test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery) &&
> >             mddev->degraded != mddev->raid_disks) {
> > --
> > 2.7.4
> >



