On Tue, Mar 22, 2022 at 4:08 AM Martin Wilck <mwilck@xxxxxxxx> wrote: > > On Mon, 2022-03-21 at 19:34 -0500, Benjamin Marzinski wrote: > > On Fri, Mar 18, 2022 at 5:33 PM <mwilck@xxxxxxxx> wrote: > > > > > > From: Martin Wilck <mwilck@xxxxxxxx> > > > > > > If multipathd needs to delay reconfigure() because it's waiting for > > > a map > > > creation uevent, it can happen that child() isn't woken up if the > > > map > > > being waited for is removed before the uevent arrives. Fix this by > > > unblocking reconfigure() with the remove_map_callback() function. > > > > > > Signed-off-by: Martin Wilck <mwilck@xxxxxxxx> > > > --- > > > multipathd/main.c | 11 +++++++++++ > > > 1 file changed, 11 insertions(+) > > > > > > diff --git a/multipathd/main.c b/multipathd/main.c > > > index f3b8eb4..4721cd8 100644 > > > --- a/multipathd/main.c > > > +++ b/multipathd/main.c > > > @@ -319,6 +319,17 @@ static bool unblock_reconfigure(void) > > > return was_delayed; > > > } > > > > > > +/* > > > + * Make sure child() is woken up when a map is removed that > > > multipathd > > > + * is currently waiting for. > > > + * Overrides libmultipath's weak symbol by the same name > > > + */ > > > +void remove_map_callback(struct multipath *mpp) > > > +{ > > > + if (mpp->wait_for_udev > 0) > > > > Is there a reason why you don't decrement wait_for_udev, and check > > need_to_delay_reconfig() like in ev_add_map()? I realize that it > > doesn't hurt anything to unblock the reconfigure even if there are > > other devices that need a delay. The main thread will notice that it > > still needs to delay. I'm just wondering why we do it differently > > here? > > The main reason was to keep it simple. need_to_delay_reconfig() needs > to be passed a "vecs" pointer, and requires the vecs lock to be > held. remove_map() is deep down in the call stack and has lots of > direct and indirect callers. It can be called with mpvec == NULL and > (in theory) also with pathvec == NULL, in which case it simply frees > the memory obtained by the map, without unlinking itself or its members > from any vectors. In that case it *could* be called without the vecs > lock held (AFAICS, that's not the case today, but the function could be > used this way, e.g. in an error handling code path). > > I thought it was easier and safer not to have to assert that every > current and future caller holds the vecs lock, in particular because > this is called indirectly from libmultipath, the function that grabs > the lock is usually high up in the call stack. > > I had indeed pondered whether I should remove the call to > need_to_delay_reconfig() from the ev_add_map(), too. I decided against > it, because I realized that waking up child() for nothing is not > cheap,as child() needs to grab the vecs lock just for calling > need_to_delay_reconfig(). We should avoid this for the common case of > an uevent which we are waiting for. > The remove_map() code path, OTOH, is a corner case (map being removed > while in the process of being added). We need to cover it, but we know > that this code path will be rarely executed in practice, and thus > unlikely to cause vecs lock contention. > > I hope this makes sense. Yeah. Whenever a map is getting removed from vecs->mpvec, the vecs lock better be held. But since a map could be getting removed from some other vector of maps, we can't use (mpvec != NULL) to be sure that the vecs lock must be held. Reviewed-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx> > > Regards > Martin > -- dm-devel mailing list dm-devel@xxxxxxxxxx https://listman.redhat.com/mailman/listinfo/dm-devel