On Mon, Feb 26, 2024 at 9:31 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 2024/02/23 21:20, Xiao Ni wrote:
> > On Fri, Feb 23, 2024 at 11:32 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
> >>
> >> Hi,
> >>
> >> On 2024/02/20 23:30, Xiao Ni wrote:
> >>> MD_RECOVERY_WAIT is used by dmraid to delay the reshape process; it was
> >>> added by commit 644e2537fdc7 ("dm raid: fix stripe adding reshape
> >>> deadlock"). Before commit f52f5c71f3d4 ("md: fix stopping sync thread"),
> >>> dmraid stopped the sync thread directly by calling md_reap_sync_thread.
> >>> After that patch, dmraid stops the sync thread asynchronously, as md
> >>> does. This is right. Now the dmraid stop process is like this:
> >>>
> >>> 1. raid_postsuspend->md_stop_writes->__md_stop_writes->stop_sync_thread.
> >>>    stop_sync_thread sets MD_RECOVERY_INTR and waits until
> >>>    MD_RECOVERY_RUNNING is cleared.
> >>> 2. md_do_sync finds MD_RECOVERY_WAIT is set and returns. (This is the
> >>>    root cause of this deadlock: we need md_do_sync to set
> >>>    MD_RECOVERY_DONE.)
> >>> 3. The md thread calls md_check_recovery. (This is the place to reap the
> >>>    sync thread, but because MD_RECOVERY_DONE is not set, the md thread
> >>>    can't reap it.)
> >>> 4. raid_dtr stops/frees struct mddev and releases dmraid-related
> >>>    resources.
> >>>
> >>> dmraid only sets MD_RECOVERY_WAIT but doesn't clear it. It needs to
> >>> clear this bit when stopping dmraid, before stopping the sync thread.
> >>>
> >>> But the deadlock can still happen sometimes even if MD_RECOVERY_WAIT is
> >>> cleared before stopping the sync thread. The reason is that
> >>> stop_sync_thread only wakes up the task directly. If the task isn't
> >>> running, it still needs to wake up the sync thread too.
> >>>
> >>> This deadlock can be reproduced 100% by these commands:
> >>>
> >>> modprobe brd rd_size=34816 rd_nr=5
> >>> while [ 1 ]; do
> >>>         vgcreate test_vg /dev/ram*
> >>>         lvcreate --type raid5 -L 16M -n test_lv test_vg
> >>>         lvconvert -y --stripes 4 /dev/test_vg/test_lv
> >>>         vgremove test_vg -ff
> >>>         sleep 1
> >>> done
> >>>
> >>> Fixes: 644e2537fdc7 ("dm raid: fix stripe adding reshape deadlock")
> >>> Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
> >>> Signed-off-by: Xiao Ni <xni@xxxxxxxxxx>
> >>
> >> I'm not sure about this change, I think MD_RECOVERY_WAIT is hacky and
> >> really breaks how sync_thread works; it should just go away soon, once
> >> we make sure sync_thread can't be registered before pers->start() is
> >> done.
> >
> > Hi Kuai
> >
> > I just want to get to a stable state without changing any existing
> > logic. After fixing these regressions, we can consider other changes
> > (e.g. remove MD_RECOVERY_WAIT, or allow sync thread start/stop while
> > the array is suspended). I talked with Heinz yesterday; for dm-raid, it
> > can't allow any io to happen, including the sync thread, when the array
> > is suspended.

Hi Kuai

> So, are you still thinking that my patchset will allow this for dm-raid?
>
> I already explained a lot about why patch 1 from my set is okay, and why
> the set doesn't introduce any behaviour change like you said in this
> patch 0:

I'll read all your patches to understand you well. But as I mentioned
many times too, we're fixing regression problems here, and it's better
for us to fix them with the smallest change. As you can see, my patch
set fixes these regression problems with small changes (sorry, I didn't
notice that your patch set has some changes that are the same as mine).
So why do we need such a big change to fix the regression problems?
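For reference, this is the hang my two small changes address, written out
as a rough C sketch. This is simplified pseudocode of the flow described
in the commit message above, not the actual md.c source; the function
bodies are condensed from the steps there:

	/* 1) raid_postsuspend -> md_stop_writes -> stop_sync_thread
	 *    (simplified): interrupt the sync thread, then wait for
	 *    MD_RECOVERY_RUNNING to be cleared. */
	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
	md_wakeup_thread_directly(mddev->sync_thread);
	wait_event(resync_wait,		/* hangs here in the buggy case */
		   !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery));

	/* 2) md_do_sync (simplified): returns early because dmraid set
	 *    MD_RECOVERY_WAIT, without ever setting MD_RECOVERY_DONE. */
	if (test_bit(MD_RECOVERY_WAIT, &mddev->recovery))
		return;

	/* 3) md_check_recovery (simplified): reaps the sync thread, and
	 *    with it clears MD_RECOVERY_RUNNING, only after the sync
	 *    thread has set MD_RECOVERY_DONE -- so the waiter in step 1
	 *    never makes progress. */
	if (test_bit(MD_RECOVERY_DONE, &mddev->recovery))
		md_reap_sync_thread(mddev);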
Now, with my patch set, I can reproduce a problem with the lvm2 test
suite that also happens in 6.6. It means that with this patch set we can
get back to the same state as 6.6.

> "Kuai's patch set breaks some rules".
>
> The only thing that will change is that for md/raid, sync_thread can
> start for a suspended array; however, I don't think this will be a
> problem because sync_thread can already be running for a suspended
> array, and 'MD_RECOVERY_FROZEN' is already used to prevent sync_thread
> from starting.

We can't allow the sync thread to run for dm-raid when it's suspended,
because dm needs to switch the table while the device is suspended.
That's part of the basic design. If it can happen now, we should fix
this.

Best Regards
Xiao

>
> Thanks,
> Kuai
>
> >
> > Regards
> > Xiao
> >>
> >> Thanks,
> >> Kuai
> >>> ---
> >>>   drivers/md/dm-raid.c | 2 ++
> >>>   drivers/md/md.c      | 1 +
> >>>   2 files changed, 3 insertions(+)
> >>>
> >>> diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
> >>> index eb009d6bb03a..325767c1140f 100644
> >>> --- a/drivers/md/dm-raid.c
> >>> +++ b/drivers/md/dm-raid.c
> >>> @@ -3796,6 +3796,8 @@ static void raid_postsuspend(struct dm_target *ti)
> >>>   	struct raid_set *rs = ti->private;
> >>>
> >>>   	if (!test_and_set_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags)) {
> >>> +		if (test_bit(MD_RECOVERY_WAIT, &rs->md.recovery))
> >>> +			clear_bit(MD_RECOVERY_WAIT, &rs->md.recovery);
> >>>   		/* Writes have to be stopped before suspending to avoid deadlocks. */
> >>>   		if (!test_bit(MD_RECOVERY_FROZEN, &rs->md.recovery))
> >>>   			md_stop_writes(&rs->md);
> >>> diff --git a/drivers/md/md.c b/drivers/md/md.c
> >>> index 2266358d8074..54790261254d 100644
> >>> --- a/drivers/md/md.c
> >>> +++ b/drivers/md/md.c
> >>> @@ -4904,6 +4904,7 @@ static void stop_sync_thread(struct mddev *mddev, bool locked, bool check_seq)
> >>>   	 * never happen
> >>>   	 */
> >>>   	md_wakeup_thread_directly(mddev->sync_thread);
> >>> +	md_wakeup_thread(mddev->sync_thread);
> >>>   	if (work_pending(&mddev->sync_work))
> >>>   		flush_work(&mddev->sync_work);