Re: [PATCH 3/5] multipathd: make ev_remove_path return success on path removal

On Wed, 2021-05-12 at 16:52 -0500, Benjamin Marzinski wrote:
> On Wed, May 12, 2021 at 08:36:49PM +0000, Martin Wilck wrote:
> > On Wed, 2021-05-12 at 14:53 -0500, Benjamin Marzinski wrote:
> > > On Wed, May 12, 2021 at 11:38:08AM +0000, Martin Wilck wrote:
> > > > On Tue, 2021-05-11 at 18:22 -0500, Benjamin Marzinski wrote:
> > > So AFAICS, the only way for a path not to get removed is if you
> > > succeed
> > > with wait_for_udev or !need_do_map, or if you fail in domap.
> > 
> > Agreed. Let's fix these comments.
> 
> Yep.
>  
> > >  Since wait_for_udev can happen in more situations,
> > > it's a lot harder to say what the right answer is. For
> > > cli_add_path
> > > and
> > > uev_add_path, it seems like we want to know if the path was
> > > really
> > > removed. So returning failure there makes sense.  For
> > > cli_del_path
> > > and
> > > uev_remove_path, it seems like we want to avoid spurious error
> > > messages
> > > when everything went alright and we're just waiting to update the
> > > map.
> > > So returning success makes sense there.
> > > 
> > > Perhaps the answer is to return symbolic values, to describe what
> > > actually happened, rather than success or failure.
> > 
> > This is what I meant. I didn't express myself clearly enough; I
> > just
> > thought that 0 doesn't have to mean "success".
> > 
> 
> Sure. I'll add symbolic returns.
> 
> > 
> > I think the callers just need to know if the path is still
> > referenced
> > somewhere. Acting appropriately is then up to the caller. You just
> > proved that my cases a) and b) are actually equivalent, which is
> > nice.
> > Perhaps we need to introduce another return code indicating that
> > the
> > entire map had been removed (e.g. failure in setup_multipath()).
> 
> The more important return to me seems to be an indication of whether
> the
> remove has been delayed. 

To make sure that we talk about the same thing: when you say "the
remove has been delayed", you mean the case where we just set
INIT_REMOVED, without actually deleting the path from pathvec etc.,
right? This is what I meant by "path is still referenced somewhere"
in my previous post. Ack, this is of course the most important thing
for the callers to know.
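
A minimal sketch of what such symbolic return values could look like,
assuming one code per outcome discussed in this thread. The enum, its
member names, and the helper function are hypothetical illustrations and
are not taken from the patch:

```c
#include <stdbool.h>

/* Hypothetical symbolic results for ev_remove_path(); names are assumptions. */
enum remove_path_result {
	REMOVE_PATH_SUCCESS,   /* path deleted from pathvec, map reloaded */
	REMOVE_PATH_DELAY,     /* only INIT_REMOVED was set, removal deferred */
	REMOVE_PATH_MAP_ERROR, /* setup_multipath() failed, whole map removed */
	REMOVE_PATH_FAILURE,   /* domap() failed, path was not removed */
};

/* Hypothetical helper: tells a caller whether the path may still be
 * referenced somewhere after the call returned. */
static bool path_still_referenced(enum remove_path_result r)
{
	return r == REMOVE_PATH_DELAY || r == REMOVE_PATH_FAILURE;
}
```

With codes like these, 0 no longer has to double as "success", and each
caller can decide for itself which outcomes deserve an error message.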

>  For uev_remove_path(), you don't want to
> return failure just because the remove has been delayed. Otherwise
> there
> will be spurious error messages in the logs.

With the introduction of INIT_REMOVED, I think we could do away with
these error messages altogether. uev_remove_path() could actually be a
void function. We *know* that at least INIT_REMOVED will be set, which
means that that path will be treated by multipathd as if it didn't
exist. The error message you're talking about would be the highly
unhelpful "uevent trigger error" message - we might as well just ditch
that message. We print much more meaningful messages in
ev_remove_path().

>  cli_del_path is a little
> trickier.  My biggest question with that is whether it would mess
> with
> people's scripts to add a reply message saying what happened. It
> seems
> like it should only fail if domap failed. But it would be nice to
> tell
> the user that the remove has been delayed, or that the map couldn't
> be
> reloaded and was removed as well. 

Same argument here. As far as multipathd is concerned, that path will
be gone. We print "fail" if the domap() call failed, and we should
continue to do so. We could add documentation saying that this means a
"deferred removal".

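For illustration, the cli_del_path policy discussed here could be
sketched roughly as below, assuming symbolic results from
ev_remove_path(). The enum, both function names, and the note strings
are hypothetical. Whether a note should be sent to the user at all is
exactly the open question, so the second function is optional:

```c
#include <stddef.h>

/* Hypothetical symbolic results; names are assumptions for this sketch. */
enum remove_path_result {
	REMOVE_PATH_SUCCESS,   /* path removed */
	REMOVE_PATH_DELAY,     /* INIT_REMOVED set, removal deferred */
	REMOVE_PATH_MAP_ERROR, /* map could not be reloaded and was removed */
	REMOVE_PATH_FAILURE,   /* domap() failed */
};

/* CLI status: nonzero (printed as "fail") only if domap() failed. */
static int cli_del_path_status(enum remove_path_result r)
{
	return r == REMOVE_PATH_FAILURE;
}

/* Optional extra reply text; NULL means there is nothing to add. */
static const char *cli_del_path_note(enum remove_path_result r)
{
	switch (r) {
	case REMOVE_PATH_DELAY:
		return "delayed";
	case REMOVE_PATH_MAP_ERROR:
		return "map reload error. removed";
	default:
		return NULL;
	}
}
```

Scripts that only look at the status would see no change with this
split, since a deferred removal still counts as success.
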
> 
> > > > However, this goes beyond the purpose of your patch. *If* we
> > > > remove
> > > > the
> > > > map, returning 0 is correct for either a) or b).
> > > > 
> > > > P.S. 2: I wonder if the logic in uev_update_path() is correct.
> > > > Rather
> > > > than calling uev_add_path() after rescan_path() directly, I
> > > > think
> > > > we
> > > > should rather wait for another uevent (and possibly trigger
> > > > another
> > > > "add" event, I don't think "rescan" automatically generates
> > > > one).
> > > > 
> > > 
> > > Yep. You're correct. I'll fix that.
> 
> Actually, I take it back. The code seems to work o.k. as is. The
> uev_update_path() code checks if get_uid() now returns a different
> value, instead of using get_vpd_sgio() like the recheck_wwid code
> does.
> This means that the uid_attribute must have already gotten updated
> when
> rescan_path() is called. So my real question is "is there any real
> benefit to calling rescan_path() at all here". This code seemed to be
> working correctly before we added it, except in the case where
> uid_attribute wasn't getting updated (which recheck_wwid now will
> hopefully catch).

My point was that calling uev_add_path() right after rescan_path() is
wrong, and I still think so - *if* we rescan, we shouldn't look at udev
properties before we can be reasonably sure that the rescan has
completed and has been processed by udev. I agree that calling
rescan_path() in this code path is probably not helpful. 

Let's remove it.

> If there is a benefit, then we have to be careful to only call it
> once.
> Otherwise, we could get stuck in an endless loop where we trigger an
> add
> uevent, which in turn triggers another add uevent, and so on.

I don't see that risk, because uev_update_path() is only called for
"change" uevents, not "add".

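To spell out the reasoning: a simplified sketch of dispatching on the
uevent action, assuming each action maps to exactly one handler. The
enum and function below are hypothetical and much reduced compared to
the real uevent handling in multipathd:

```c
#include <string.h>

/* Hypothetical handler identifiers, for illustration only. */
enum uev_handler {
	UEV_HANDLER_NONE,
	UEV_HANDLER_ADD,    /* stands in for uev_add_path() */
	UEV_HANDLER_UPDATE, /* stands in for uev_update_path() */
	UEV_HANDLER_REMOVE, /* stands in for uev_remove_path() */
};

/* Pick the handler for a uevent action string. */
static enum uev_handler pick_handler(const char *action)
{
	if (!strcmp(action, "add"))
		return UEV_HANDLER_ADD;
	if (!strcmp(action, "change"))
		return UEV_HANDLER_UPDATE;
	if (!strcmp(action, "remove"))
		return UEV_HANDLER_REMOVE;
	return UEV_HANDLER_NONE;
}
```

An "add" event triggered from the update handler is routed to the add
handler, which does not trigger further events, so the chain ends there.
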
Regards,
Martin

> 
> -Ben
>  
> > > -Ben
> > > 
> > > > 
> > > > > ---
> > > > >  multipathd/main.c | 6 ++++--
> > > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/multipathd/main.c b/multipathd/main.c
> > > > > index 6090434c..4bdf14bd 100644
> > > > > --- a/multipathd/main.c
> > > > > +++ b/multipathd/main.c
> > > > > @@ -1284,7 +1284,7 @@ ev_remove_path (struct path *pp, struct vectors * vecs, int need_do_map)
> > > > >  
> > > > >                         strlcpy(devt, pp->dev_t, sizeof(devt));
> > > > >                         if (setup_multipath(vecs, mpp))
> > > > > -                               return 1;
> > > > > +                               return 0;
> > > > >                         /*
> > > > >                          * Successful map reload without this path:
> > > > >                          * sync_map_state() will free it.
> > > > > @@ -1304,8 +1304,10 @@ out:
> > > > >         return retval;
> > > > >  
> > > > >  fail:
> > > > > +       condlog(0, "%s: error removing path. removing map %s", pp->dev,
> > > > > +               mpp->alias);
> > > > >         remove_map_and_stop_waiter(mpp, vecs);
> > > > > -       return 1;
> > > > > +       return 0;
> > > > >  }
> > > > >  
> > > > >  static int
> > > 
> > > --
> > > dm-devel mailing list
> > > dm-devel@xxxxxxxxxx
> > > https://listman.redhat.com/mailman/listinfo/dm-devel
> > > 
> 


--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/dm-devel