On Mon, Jun 6, 2022 at 11:08 AM Guoqing Jiang <guoqing.jiang@xxxxxxxxx> wrote: > > > > >> + } > > Maybe instead of the ugly boolean argument we could pull > > md_unregister_thread() into all the callers and explicitly unlock in the > > single call site that needs it? > > Sounds good, let me revert previous commit then pull it. > Hi all Now md_reap_sync_thread is called by __md_stop_writes, action_store and md_check_recovery. If we move md_unregister_thread out of md_reap_sync_thread and unlock reconfig_mutex before calling md_unregister_thread, it means we break the atomic action for these three functions mentioned above. Not sure if it can introduce other problems, especially for the change of md_check_recovery. How about suspend I/O when changing the sync action? It should not hurt the performance, because it only interrupts the sync action and flush the in memory stripes. For this way, it needs to revert 7e6ba434cc60 (md: don't unregister sync_thread with reconfig_mutex held) and changes like this: diff --git a/drivers/md/md.c b/drivers/md/md.c index ce9d2845d3ac..af28cdeaf7e1 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -4827,12 +4827,14 @@ action_store(struct mddev *mddev, const char *page, size_t len) clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery); if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) && mddev_lock(mddev) == 0) { + mddev_suspend(mddev); if (work_pending(&mddev->del_work)) flush_workqueue(md_misc_wq); if (mddev->sync_thread) { set_bit(MD_RECOVERY_INTR, &mddev->recovery); md_reap_sync_thread(mddev); } + mddev_resume(mddev); mddev_unlock(mddev); } } else if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) mddev_suspend can stop the I/O and the superblock can be updated too. Because MD_ALLOW_SB_UPDATE is set. So raid5d can go on handling stripes. Best Regards Xiao