On Mon, 15 Aug 2022, Guoqing Jiang wrote: > Hi Neil, > > Sorry for later reply since I was on vacation last week. > > On 8/12/22 10:16 AM, NeilBrown wrote: > > [[ I noticed the e151 patch recently which seems to admit that it broke > > something. So I looked into it and came up with this. > > I just noticed it ... > > > It seems to make sense, but I'm not in a position to test it. > > Guoqing: does it look OK to you? > > - NeilBrown > > ]] > > > > As described in Commit: 48df498daf62 ("md: move bitmap_destroy to the > > beginning of __md_stop") md_cluster_stop() needs to run before the > > mdddev->thread is stopped. > > The change to make this happen was reverted in Commit: e151db8ecfb0 > > ("md-raid: destroy the bitmap after destroying the thread") due to > > problems it caused. > > > > To restore correct behaviour, make md_cluster_stop() reentrant and > > explicitly call it at the start of __md_stop(), after first calling > > md_bitmap_wait_behind_writes(). > > > > Fixes: e151db8ecfb0 ("md-raid: destroy the bitmap after destroying the thread") > > Signed-off-by: NeilBrown <neilb@xxxxxxx> > > --- > > drivers/md/md-cluster.c | 1 + > > drivers/md/md.c | 3 +++ > > 2 files changed, 4 insertions(+) > > > > diff --git a/drivers/md/md-cluster.c b/drivers/md/md-cluster.c > > index 742b2349fea3..37bf0aa4ed71 100644 > > --- a/drivers/md/md-cluster.c > > +++ b/drivers/md/md-cluster.c > > @@ -1009,6 +1009,7 @@ static int leave(struct mddev *mddev) > > test_bit(MD_CLOSING, &mddev->flags))) > > resync_bitmap(mddev); > > > > + mddev->cluster_info = NULL; > > The above makes sense. Thanks. > > > set_bit(MD_CLUSTER_HOLDING_MUTEX_FOR_RECVD, &cinfo->state); > > md_unregister_thread(&cinfo->recovery_thread); > > md_unregister_thread(&cinfo->recv_thread); > > diff --git a/drivers/md/md.c b/drivers/md/md.c > > index afaf36b2f6ab..a57b2dff64dd 100644 > > --- a/drivers/md/md.c > > +++ b/drivers/md/md.c > > @@ -6238,6 +6238,9 @@ static void mddev_detach(struct mddev *mddev) > > static void __md_stop(struct mddev *mddev) > > { > > struct md_personality *pers = mddev->pers; > > + > > + md_bitmap_wait_behind_writes(mddev); > > + md_cluster_stop(mddev); > > mddev_detach(mddev); > > /* Ensure ->event_work is done */ > > if (mddev->event_work.func) > > The md_bitmap_destroy is called in __md_stop with or without e151db8ecfb0, > and it already invokes md_bitmap_wait_behind_writes and md_cluster_stop by > md_bitmap_free. So the above is sort of redundant to me. The point is that md_cluster_stop() needs to run before mddev_detach() shuts down the thread. If we don't call all of md_bitmap_destroy() before mddev_detach() we need to at least run md_cluster_stop(), and I think we need to run md_bitmap_wait_behind_writes() before that. > > For the issue described in e151db8ecfb, looks like raid1d was running after > __md_stop, I am wondering if dm-raid should stop write first just like > normal > md-raid. That looks like a really good idea, that should make it safe to move md_bitmap_destroy() back before mddev_detach(). Would you like to post a patch to make those two changes, and include Mikulas Patocka, or should I? Thanks, NeilBrown > > diff --git a/drivers/md/md.c b/drivers/md/md.c > index afaf36b2f6ab..afc8d638eba0 100644 > --- a/drivers/md/md.c > +++ b/drivers/md/md.c > @@ -6260,6 +6260,7 @@ void md_stop(struct mddev *mddev) > /* stop the array and free an attached data structures. > * This is called from dm-raid > */ > + __md_stop_writes(mddev); > __md_stop(mddev); > bioset_exit(&mddev->bio_set); > bioset_exit(&mddev->sync_set); > > Thanks, > Guoqing >