On Fri, Feb 23, 2024 at 6:31 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote: > > Hi, > > 在 2024/02/20 23:30, Xiao Ni 写道: > > MD_RECOVERY_WAIT is used by dmraid to delay reshape process by patch > > commit 644e2537fdc7 ("dm raid: fix stripe adding reshape deadlock"). > > Before patch commit f52f5c71f3d4b ("md: fix stopping sync thread") > > dmraid stopped sync thread directy by calling md_reap_sync_thread. > > After this patch dmraid stops sync thread asynchronously as md does. > > This is right. Now the dmraid stop process is like this: > > > > 1. raid_postsuspend->md_stop_writes->__md_stop_writes->stop_sync_thread. > > stop_sync_thread sets MD_RECOVERY_INTR and wait until MD_RECOVERY_RUNNING > > is cleared > > 2. md_do_sync finds MD_RECOVERY_WAIT is set and return. (This is the > > root cause for this deadlock. We hope md_do_sync can set MD_RECOVERY_DONE) > > 3. md thread calls md_check_recovery (This is the place to reap sync > > thread. Because MD_RECOVERY_DONE is not set. md thread can't reap sync > > thread) > > 4. raid_dtr stops/free struct mddev and release dmraid related resources > > > > dmraid only sets MD_RECOVERY_WAIT but doesn't clear it. It needs to clear > > this bit when stopping the dmraid before stopping sync thread. > > > > But the deadlock still can happen sometimes even MD_RECOVERY_WAIT is > > cleared before stopping sync thread. It's the reason stop_sync_thread only > > wakes up task. If the task isn't running, it still needs to wake up sync > > thread too. > > > > This deadlock can be reproduced 100% by these commands: > > modprobe brd rd_size=34816 rd_nr=5 > > while [ 1 ]; do > > vgcreate test_vg /dev/ram* > > lvcreate --type raid5 -L 16M -n test_lv test_vg > > lvconvert -y --stripes 4 /dev/test_vg/test_lv > > vgremove test_vg -ff > > sleep 1 > > done > > > > Fixes: 644e2537fdc7 ("dm raid: fix stripe adding reshape deadlock") > > Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") > > Signed-off-by: Xiao Ni <xni@xxxxxxxxxx> > > --- > > drivers/md/dm-raid.c | 2 ++ > > drivers/md/md.c | 1 + > > 2 files changed, 3 insertions(+) > > > > diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c > > index eb009d6bb03a..325767c1140f 100644 > > --- a/drivers/md/dm-raid.c > > +++ b/drivers/md/dm-raid.c > > @@ -3796,6 +3796,8 @@ static void raid_postsuspend(struct dm_target *ti) > > struct raid_set *rs = ti->private; > > > > if (!test_and_set_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags)) { > > + if (test_bit(MD_RECOVERY_WAIT, &rs->md.recovery)) > > + clear_bit(MD_RECOVERY_WAIT, &rs->md.recovery); > > Notice that 'MD_RECOVERY_WAIT' will never be cleared, hence sync_thread > will never make progress until table reload for dm-raid. Hi Kuai After this patch, it indeed fix the problem mentioned in this patch. So it can be cleared before stopping sync thread. I don't know why you say it never be cleared. > > And other than stopping dm-raid, raid_postsuspend() call also be called > by ioctl to suspend dm-raid, hence this change is wrong. >From code review, raid_postsuspend is used to stop sync thread and suspend array. Maybe I don't understand right. If I'm right, it needs to clear MD_RECOVERY_WAIT before stopping sync thread. > > I think we can never clear 'MD_RECOVERY_FROZEN' in this case so that > 'MD_RECOVERY_WAIT' can be removed, or use '!MD_RECOVERY_WAIT' as a > condition to register new sync_thread. I don't understand you here. From debug, there are three reloads during dmraidd reshape. In the second reload, it doesn't want to start sync thread. It's the reason that it sets MD_RECOVERY_WAIT because dmraid is still not ready. In the third reload, it stops the mddev which is created in the second reload and create a new mddev. During this process, MD_RECOVERY_WAIT always works until suspend mddev which is created in the second mddev. It has no relationship with MD_RECOVERY_FROZEN. Regards Xiao > > Thanks, > Kuai > > /* Writes have to be stopped before suspending to avoid deadlocks. */ > > if (!test_bit(MD_RECOVERY_FROZEN, &rs->md.recovery)) > > md_stop_writes(&rs->md); > > diff --git a/drivers/md/md.c b/drivers/md/md.c > > index 2266358d8074..54790261254d 100644 > > --- a/drivers/md/md.c > > +++ b/drivers/md/md.c > > @@ -4904,6 +4904,7 @@ static void stop_sync_thread(struct mddev *mddev, bool locked, bool check_seq) > > * never happen > > */ > > md_wakeup_thread_directly(mddev->sync_thread); > > + md_wakeup_thread(mddev->sync_thread); > > if (work_pending(&mddev->sync_work)) > > flush_work(&mddev->sync_work); > > > > >