Re: [PATCH RFC] md: call md_cluster_stop() in __md_stop()

"NeilBrown" <neilb@xxxxxxx> · Tue, 16 Aug 2022 10:58:47 +1000

On Mon, 15 Aug 2022, Guoqing Jiang wrote:
> Hi Neil,
> 
> Sorry for later reply since I was on vacation last week.
> 
> On 8/12/22 10:16 AM, NeilBrown wrote:
> > [[ I noticed the e151 patch recently which seems to admit that it broke
> >     something.  So I looked into it and came up with this.
> 
> I just noticed it ...
> 
> >     It seems to make sense, but I'm not in a position to test it.
> >     Guoqing: does it look OK to you?
> >     - NeilBrown
> > ]]
> >
> > As described in Commit: 48df498daf62 ("md: move bitmap_destroy to the
> > beginning of __md_stop") md_cluster_stop() needs to run before the
> > mdddev->thread is stopped.
> > The change to make this happen was reverted in Commit: e151db8ecfb0
> > ("md-raid: destroy the bitmap after destroying the thread") due to
> > problems it caused.
> >
> > To restore correct behaviour, make md_cluster_stop() reentrant and
> > explicitly call it at the start of __md_stop(), after first calling
> > md_bitmap_wait_behind_writes().
> >
> > Fixes: e151db8ecfb0 ("md-raid: destroy the bitmap after destroying the thread")
> > Signed-off-by: NeilBrown <neilb@xxxxxxx>
> > ---
> >   drivers/md/md-cluster.c | 1 +
> >   drivers/md/md.c         | 3 +++
> >   2 files changed, 4 insertions(+)
> >
> > diff --git a/drivers/md/md-cluster.c b/drivers/md/md-cluster.c
> > index 742b2349fea3..37bf0aa4ed71 100644
> > --- a/drivers/md/md-cluster.c
> > +++ b/drivers/md/md-cluster.c
> > @@ -1009,6 +1009,7 @@ static int leave(struct mddev *mddev)
> >   	     test_bit(MD_CLOSING, &mddev->flags)))
> >   		resync_bitmap(mddev);
> >   
> > +	mddev->cluster_info = NULL;
> 
> The above makes sense.

Thanks.

> 
> >   	set_bit(MD_CLUSTER_HOLDING_MUTEX_FOR_RECVD, &cinfo->state);
> >   	md_unregister_thread(&cinfo->recovery_thread);
> >   	md_unregister_thread(&cinfo->recv_thread);
> > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > index afaf36b2f6ab..a57b2dff64dd 100644
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -6238,6 +6238,9 @@ static void mddev_detach(struct mddev *mddev)
> >   static void __md_stop(struct mddev *mddev)
> >   {
> >   	struct md_personality *pers = mddev->pers;
> > +
> > +	md_bitmap_wait_behind_writes(mddev);
> > +	md_cluster_stop(mddev);
> >   	mddev_detach(mddev);
> >   	/* Ensure ->event_work is done */
> >   	if (mddev->event_work.func)
> 
> The md_bitmap_destroy is called in __md_stop with or without e151db8ecfb0,
> and it already invokes md_bitmap_wait_behind_writes and md_cluster_stop by
> md_bitmap_free. So the above is sort of redundant to me.

The point is that md_cluster_stop() needs to run before mddev_detach()
shuts down the thread.  If we don't call all of md_bitmap_destroy()
before mddev_detach() we need to at least run md_cluster_stop(), and I
think we need to run md_bitmap_wait_behind_writes() before that.

> 
> For the issue described in e151db8ecfb, looks like raid1d was running after
> __md_stop, I am wondering if dm-raid should stop write first just like 
> normal
> md-raid.

That looks like a really good idea, that should make it safe to move
md_bitmap_destroy() back before mddev_detach().
Would you like to post a patch to make those two changes, and include
Mikulas Patocka, or should I?

Thanks,
NeilBrown

> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index afaf36b2f6ab..afc8d638eba0 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -6260,6 +6260,7 @@ void md_stop(struct mddev *mddev)
>          /* stop the array and free an attached data structures.
>           * This is called from dm-raid
>           */
> +       __md_stop_writes(mddev);
>          __md_stop(mddev);
>          bioset_exit(&mddev->bio_set);
>          bioset_exit(&mddev->sync_set);
> 
> Thanks,
> Guoqing
>