On Tue, Jun 13, 2023 at 6:15 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 2023/06/13 22:43, Xiao Ni wrote:
> >
> > On 2023/5/29 9:20 PM, Yu Kuai wrote:
> >> From: Yu Kuai <yukuai3@xxxxxxxxxx>
> >>
> >> Currently, for idle and frozen, action_store will hold 'reconfig_mutex'
> >> and call md_reap_sync_thread() to stop sync thread, however, this will
> >> cause deadlock (explained in the next patch). In order to fix the
> >> problem, following patch will release 'reconfig_mutex' and wait on
> >> 'resync_wait', like md_set_readonly() and do_md_stop() does.
> >>
> >> Consider that action_store() will set/clear 'MD_RECOVERY_FROZEN'
> >> unconditionally, which might cause unexpected problems, for example,
> >> frozen just set 'MD_RECOVERY_FROZEN' and is still in progress, while
> >> 'idle' clear 'MD_RECOVERY_FROZEN' and new sync thread is started, which
> >> might starve in progress frozen. A mutex is added to synchronize idle
> >> and frozen from action_store().
> >>
> >> Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
> >> ---
> >>   drivers/md/md.c | 5 +++++
> >>   drivers/md/md.h | 3 +++
> >>   2 files changed, 8 insertions(+)
> >>
> >> diff --git a/drivers/md/md.c b/drivers/md/md.c
> >> index 23e8e7eae062..63a993b52cd7 100644
> >> --- a/drivers/md/md.c
> >> +++ b/drivers/md/md.c
> >> @@ -644,6 +644,7 @@ void mddev_init(struct mddev *mddev)
> >>       mutex_init(&mddev->open_mutex);
> >>       mutex_init(&mddev->reconfig_mutex);
> >>       mutex_init(&mddev->delete_mutex);
> >> +    mutex_init(&mddev->sync_mutex);
> >>       mutex_init(&mddev->bitmap_info.mutex);
> >>       INIT_LIST_HEAD(&mddev->disks);
> >>       INIT_LIST_HEAD(&mddev->all_mddevs);
> >> @@ -4785,14 +4786,18 @@ static void stop_sync_thread(struct mddev *mddev)
> >>   static void idle_sync_thread(struct mddev *mddev)
> >>   {
> >> +    mutex_lock(&mddev->sync_mutex);
> >>       clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
> >>       stop_sync_thread(mddev);
> >> +    mutex_unlock(&mddev->sync_mutex);
> >>   }
> >>   static void frozen_sync_thread(struct mddev *mddev)
> >>   {
> >> +    mutex_init(&mddev->delete_mutex);
> >
> >
> > typo error? It should be mutex_lock(&mddev->sync_mutex); ?
> >
>
> Yes, and thanks for spotting this, this looks like I did this while
> rebasing.

I fixed this one and applied the set to md-next.

Thanks,
Song

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/dm-devel