Hi Lixiaokeng, On Tue, 2020-08-18 at 21:09 +0800, lixiaokeng wrote: > There may be a race window here: > 1. all paths gone, causing map flushed both from multipathd and > kernel > 2. paths regenerated, causing multipathd creating the map again. > > 1 will generate a remove uevent which can be handled after 2, so we > can > disable queueing for the map created by 2 here temporarily and let > the > change uevent (generated by 2) calling uev_add_map->setup_multipath > to set queueing again. This can prevent the deadlock in this race > window. > > The possible deadlock is: all udevd workers hangs in devices because > of > queue_if_no_path, so no udevd workers can handle new event and since > multipathd will remove the map, the checkerloop cannot check this > map's > retry tick timeout and cancel the io hang which makes the udevd > worker > hang forever. multipathd cannot receive any uevent from udevd because > all udevd workers hang there so the map cannot be recreated again > which > makes a deadlock. > > Signed-off-by: Lixiaokeng@xxxxxxxxxx A map which is removed and not yet re-added again (as far as udev is concerned) doesn't need to queue because it can't possibly be in use. So I think the patch can't hurt in other scenarios, at it makes sense in the situation you describe. However I have a few questions. Have you observed this, or is it theory? I'm wondering: After 2) there should be some paths again, so why would the udev workers hang? I guess this could happen if the regenerated paths all in failed / standby state, is that what you mean? Note also that we set DM_NOSCAN in the udev rules when there are no usable paths, so udev workers would only hang if the last path fails / is removed after the "multipath -U" check. You've certainly hit a weak spot here, and you've nicely described a potential problem scenario. The delayed processing of uevents that multipathd triggered itself is a recurring cause of headache. Regards, Martin > --- > multipathd/main.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/multipathd/main.c b/multipathd/main.c > index baa18183..d7e20a10 100644 > --- a/multipathd/main.c > +++ b/multipathd/main.c > @@ -798,6 +798,7 @@ uev_remove_map (struct uevent * uev, struct > vectors * vecs) > goto out; > } > > + dm_queue_if_no_path(alias, 0); > remove_map_and_stop_waiter(mpp, vecs); > out: > lock_cleanup_pop(vecs->lock); -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel