On Tue, 2020-08-18 at 21:09 +0800, lixiaokeng wrote:
> There may be a race window here:
> 1. all paths are gone, causing the map to be flushed both from
> multipathd and the kernel
> 2. the paths are regenerated, causing multipathd to create the map
> again.
>
> Step 1 generates a remove uevent which can be handled after step 2,
> so we can temporarily disable queueing here for the map created by
> step 2, and let the change uevent (generated by step 2) call
> uev_add_map->setup_multipath to enable queueing again. This prevents
> the deadlock in this race window.
>
> The possible deadlock is: all udevd workers hang on the devices
> because of queue_if_no_path, so no udevd worker can handle new
> events. Since multipathd has removed the map, the checkerloop cannot
> notice the map's retry tick timing out and cancel the hanging I/O,
> so the udevd workers hang forever. multipathd cannot receive any
> uevent from udevd because all udevd workers are stuck there, so the
> map cannot be recreated, which makes a deadlock.
>
> Signed-off-by: Lixiaokeng@xxxxxxxxxx

As noted in my other reply, I don't fully understand how this deadlock
actually came to pass. But disabling queueing on a map which can't be
in use at the given point in time can do no harm. So:

Reviewed-by: Martin Wilck <mwilck@xxxxxxxx>

> ---
>  multipathd/main.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/multipathd/main.c b/multipathd/main.c
> index baa18183..d7e20a10 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -798,6 +798,7 @@ uev_remove_map (struct uevent * uev, struct vectors * vecs)
>  		goto out;
>  	}
>
> +	dm_queue_if_no_path(alias, 0);
>  	remove_map_and_stop_waiter(mpp, vecs);
>  out:
>  	lock_cleanup_pop(vecs->lock);

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
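
For readers following the thread, below is a sketch of how the tail of
uev_remove_map() reads with this patch applied, with the race spelled
out in comments. It is a paraphrase for illustration, not the exact
tree: the alias/minor extraction, locking, and map lookup above the
hunk are elided, and only the identifiers visible in the hunk --
dm_queue_if_no_path(), remove_map_and_stop_waiter(),
lock_cleanup_pop() -- are taken from the patch itself.

static int
uev_remove_map (struct uevent * uev, struct vectors * vecs)
{
	char *alias;
	struct multipath *mpp;

	/* ... extract alias/minor from the uevent, take vecs->lock,
	 * and look the map up in vecs->mpvec (elided) ... */

	if (!mpp) {
		/* map not registered; nothing to remove */
		goto out;
	}

	/*
	 * Race: this remove uevent belongs to the old map flushed in
	 * step 1, but multipathd may already have re-created the map
	 * (step 2).  The map found here cannot be in active use at
	 * this point, so disabling queueing is safe.  It releases any
	 * udevd worker blocked on queued I/O before the map is torn
	 * down; the change uevent from the re-creation then goes
	 * through uev_add_map -> setup_multipath and restores the
	 * configured queue_if_no_path setting.
	 */
	dm_queue_if_no_path(alias, 0);
	remove_map_and_stop_waiter(mpp, vecs);
out:
	lock_cleanup_pop(vecs->lock);
	/* ... free alias and return (elided) ... */
	return 0;
}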