On Fri, May 6, 2022 at 4:37 AM Guoqing Jiang <guoqing.jiang@xxxxxxxxx> wrote: > > From: Guoqing Jiang <guoqing.jiang@xxxxxxxxxxxxxxx> > > Unregister sync_thread doesn't need to hold reconfig_mutex since it > doesn't reconfigure array. > > And it could cause deadlock problem for raid5 as follows: > > 1. process A tried to reap sync thread with reconfig_mutex held after echo > idle to sync_action. > 2. raid5 sync thread was blocked if there were too many active stripes. > 3. SB_CHANGE_PENDING was set (because of write IO comes from upper layer) > which causes the number of active stripes can't be decreased. > 4. SB_CHANGE_PENDING can't be cleared since md_check_recovery was not able > to hold reconfig_mutex. > > More details in the link: > https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@xxxxxxxxxxxxx/T/#t > > Let's call unregister thread between mddev_unlock and mddev_lock_nointr > (thanks for the report from kernel test robot <lkp@xxxxxxxxx>) if the > reconfig_mutex is held, and mddev_is_locked is introduced accordingly. mddev_is_locked() feels really hacky to me. It cannot tell whether mddev is locked by current thread. So technically, we can unlock reconfigure_mutex for other thread by accident, no? > > Reported-by: Donald Buczek <buczek@xxxxxxxxxxxxx> > Signed-off-by: Guoqing Jiang <guoqing.jiang@xxxxxxxxx> > --- > drivers/md/md.c | 9 ++++++++- > drivers/md/md.h | 5 +++++ > 2 files changed, 13 insertions(+), 1 deletion(-) > > diff --git a/drivers/md/md.c b/drivers/md/md.c > index 309b3af906ad..525f65682356 100644 > --- a/drivers/md/md.c > +++ b/drivers/md/md.c > @@ -9432,10 +9432,17 @@ void md_reap_sync_thread(struct mddev *mddev) > { > struct md_rdev *rdev; > sector_t old_dev_sectors = mddev->dev_sectors; > - bool is_reshaped = false; > + bool is_reshaped = false, is_locked = false; > > + if (mddev_is_locked(mddev)) { > + is_locked = true; > + mddev_unlock(mddev); > + } > /* resync has finished, collect result */ > md_unregister_thread(&mddev->sync_thread); > + if (is_locked) > + mddev_lock_nointr(mddev); > + > if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery) && > !test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery) && > mddev->degraded != mddev->raid_disks) { > diff --git a/drivers/md/md.h b/drivers/md/md.h > index 6ac283864533..af6f3978b62b 100644 > --- a/drivers/md/md.h > +++ b/drivers/md/md.h > @@ -549,6 +549,11 @@ static inline int mddev_trylock(struct mddev *mddev) > } > extern void mddev_unlock(struct mddev *mddev); > > +static inline int mddev_is_locked(struct mddev *mddev) > +{ > + return mutex_is_locked(&mddev->reconfig_mutex); > +} > + > static inline void md_sync_acct(struct block_device *bdev, unsigned long nr_sectors) > { > atomic_add(nr_sectors, &bdev->bd_disk->sync_io); > -- > 2.31.1 >