Re: [PATCH] md: only unlock mddev from action_store

Xiao Ni <xni@xxxxxxxxxx> · Mon, 6 Jun 2022 17:36:31 +0800

On Mon, Jun 6, 2022 at 5:00 PM Guoqing Jiang <guoqing.jiang@xxxxxxxxx> wrote:
>
>
>
> On 6/6/22 4:39 PM, Xiao Ni wrote:
> > On Mon, Jun 6, 2022 at 11:08 AM Guoqing Jiang <guoqing.jiang@xxxxxxxxx> wrote:
> >>
> >>
> >>>> +    }
> >>> Maybe instead of the ugly boolean argument we could pull
> >>> md_unregister_thread() into all the callers and explicitly unlock in the
> >>> single call site that needs it?
> >> Sounds good, let me revert previous commit then pull it.
> >>
> > Hi all
> >
> > Now md_reap_sync_thread is called by __md_stop_writes, action_store
> > and md_check_recovery.
> > If we move md_unregister_thread out of md_reap_sync_thread and unlock
> > reconfig_mutex before
> > calling md_unregister_thread, it means we break the atomic action for
> > these three functions mentioned
> > above.
>
> No, only action_store need to be changed, other callers keep the original
> behavior.

Hi Guoqing

Thanks for pointing out this.

>
> >   Not sure if it can introduce other problems, especially for the
> > change of md_check_recovery.
> >
> > How about suspend I/O when changing the sync action? It should not
> > hurt the performance, because
> > it only interrupts the sync action and flush the in memory stripes.
> > For this way, it needs to revert
> > 7e6ba434cc60 (md: don't unregister sync_thread with reconfig_mutex
> > held) and changes like this:
> >
> > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > index ce9d2845d3ac..af28cdeaf7e1 100644
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -4827,12 +4827,14 @@ action_store(struct mddev *mddev, const char
> > *page, size_t len)
> >                          clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
> >                  if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) &&
> >                      mddev_lock(mddev) == 0) {
> > +                       mddev_suspend(mddev);
> >                          if (work_pending(&mddev->del_work))
> >                                  flush_workqueue(md_misc_wq);
> >                          if (mddev->sync_thread) {
> >                                  set_bit(MD_RECOVERY_INTR, &mddev->recovery);
> >                                  md_reap_sync_thread(mddev);
> >                          }
> > +                       mddev_resume(mddev);
> >                          mddev_unlock(mddev);
> >                  }
> >          } else if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
>
> The action actually means sync action which are for internal IO instead
> of external IO, I suppose the semantic is different with above change.

The original problem should be i/o happen when interrupting the sync thread?
So the external i/o calls md_write_start and sets MD_SB_CHANGE_PENDING.
Then raid5d->md_check_recovery can't update superblock and handle internal
stripes. So the `echo idle` action is stuck.

>
> And there are lots of sync/locking actions in suspend path, I am not
> sure it will cause other locking issues, you may need to investigate it.

Yes, mddev_supsend needs those sync methods to keep the raid in a quiescent
state. First, it needs to get reconfig_mutex before calling mddev_supsend.
So it can avoid racing with all the actions that want to change the raid. Then
it waits mddev->active_io to 0 which means all submit bio processes stop
submitting io. Then it waits until all internal i/os finish by
pers->quiesce. Last it waits
until the superblock is updated.

>From the logic, it looks safe. By the way, the patch 35bfc52187f
(md: allow metadata update while suspending) which introduces
MD_UPDATING_SB and MD_UPDATING_SB fixes the similar deadlock
problem. So in this problem, it looks like mddev_suspend is a good choice.

As you mentioned, it needs to consider and do tests more because of
the complex sync/locking actions.

Best Regards
Xiao