On Fri, Apr 28, 2023 at 12:13 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote: > > From: Yu Kuai <yukuai3@xxxxxxxxxx> > > commit 5792a2856a63 ("[PATCH] md: avoid a deadlock when removing a device > from an md array via sysfs") delay the deleting of rdev, however, this > introduce a window that rdev can be added again while the deleting is > not done yet, and sysfs will complain about duplicate filename. > > Follow up patches try to fix this problem by flush workqueue, however, > flush_rdev_wq() is just dead code, the progress in > md_kick_rdev_from_array(): > > 1) list_del_rcu(&rdev->same_set); > 2) synchronize_rcu(); > 3) queue_work(md_rdev_misc_wq, &rdev->del_work); > > So in flush_rdev_wq(), if rdev is found in the list, work_pending() can > never pass, in the meantime, if work is queued, then rdev can never be > found in the list. > > flush_rdev_wq() can be replaced by flush_workqueue() directly, however, > this approach is not good: > - the workqueue is global, this synchronization for all raid disks is > not necessary. > - flush_workqueue can't be called under 'reconfig_mutex', there is still > a small window between flush_workqueue() and mddev_lock() that other > context can queue new work, hence the problem is not solved completely. > > sysfs already have apis to support delete itself through writer, and > these apis, specifically sysfs_break/unbreak_active_protection(), is used > so support deleting rdev synchronously. Therefore, the above commit can be > reverted, and sysfs duplicate filename can be avoided. > > A new mdadm regression test [1] is proposed as well. > > Link: https://lore.kernel.org/linux-raid/20230428062845.1975462-1-yukuai1@xxxxxxxxxxxxxxx/ > Fixes: 5792a2856a63 ("[PATCH] md: avoid a deadlock when removing a device from an md array via sysfs") > Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx> > --- > drivers/md/md.c | 85 +++++++++++++++++++++++++------------------------ > drivers/md/md.h | 8 +++++ > 2 files changed, 51 insertions(+), 42 deletions(-) > > diff --git a/drivers/md/md.c b/drivers/md/md.c > index 0a4a10d4c0e0..e1bc223d58b3 100644 > --- a/drivers/md/md.c > +++ b/drivers/md/md.c > @@ -83,12 +83,12 @@ static struct module *md_cluster_mod; > static DECLARE_WAIT_QUEUE_HEAD(resync_wait); > static struct workqueue_struct *md_wq; > static struct workqueue_struct *md_misc_wq; > -static struct workqueue_struct *md_rdev_misc_wq; > > static int remove_and_add_spares(struct mddev *mddev, > struct md_rdev *this); > static void mddev_detach(struct mddev *mddev); > static void md_wakeup_thread_directly(struct md_thread __rcu *thread); > +static void export_rdev(struct md_rdev *rdev); There is some conflict with md_wakeup_thread_directly(). I guess you run format-patch on top of the "protect md_thread" set which is not merged yet (and we are waiting for v8 of it?). Please redo the patch on top of the latest md-next. I just updated it. Thanks, Song