Re: [PATCH 0/3] New approach at handling changed WWIDs

Martin Wilck <mwilck@xxxxxxxx> · Tue, 19 Mar 2019 19:44:09 +0100

On Tue, 2019-03-19 at 12:11 -0500, Benjamin Marzinski wrote:
> On Mon, Mar 18, 2019 at 01:12:32PM +0100, Martin Wilck wrote:
> > Hi Ben, hi Christophe,
> > 
> > after reviewing Ben's last patch set, I've been pondering over the
> > handling of changed WWIDs, which seems to have become a bit too
> > clever
> > and complex for my taste, and I came up with this new approach
> > instead.
> > 
> > TL;DR: instead of treating paths with changed WWIDs as faulty, act
> > as if the path was removed, and another path was then added and
> > obtained the device ID of the removed path.
> > 
> > Long version:
> 
> Mine is even longer. Sorry :(

NP, but you'll forgive me for shortening it a a bit for my reply :-)

> While I agree that a path's wwid changing to a different non-null
> wwid
> is always bad, and removing and re-adding the path seems perfectly
> sensible for that scenario, I've can think of two other case where
> the
> wwid can change to null.  d) a change event happens while the path is
> down.  e) there is a uevent storm and udev times out while handling a
> change event.
> 
> d) and e) seem to me like by far the most likely scenarios, so doing
> something suboptimal for them, in order to optimize the very unlikely
> scenarios seems wrong.

You've worked on this far more than I did, so I trust your judgement.

One remark: With the kernel exporting the VPD pages in sysfs
and my late "multipath -u" patch series, udev should be able to process
uevents for path devices without a single actual device IO. (*) 
While there are still possible causes for udev processing to get stuck,
this would make it much less likely.

> > What remains to be considered is what Ben was dealing with in his
> > latest set,
> > a permanent or temporary failure to retrieve the WWID, resulting in
> > a 0-length
> > WWID to be returned. IMO it's actually the best thing about this
> > new approach
> > that this doesn't need to be special-cased. The path initialization
> > logic
> > that we already have would take care of it using the
> > INIT_MISSING_UDEV logic.
> > Either the WWID would be successfully retrieved eventually, in
> > which case the
> > path would be re-added to the previous map, or (depending on
> > configuration)
> > added to a new map or left alone. Or the failure is permanent, in
> > which case
> > multipathd would eventually give up and orphan the path. AFAICS
> > this would
> > be the "right thing" to do in all these different cases, without
> > any
> > additional logic.
> 
> This is where I don't fully agree. It means that we can be stuck
> waiting
> for a uevent to occur when the device has come back up, instead of
> just
> restoring it.  I would really like to keep as little code as possible
> that needs to get run before we restore a working path, because that
> could be, for instance, the last path to your root filesystem, and
> you
> can't touch any non-cached files on it, until you restore that path.
> 
> Even aside from the extreme case, multipathd is doing a lot of extra
> work, and quite possibly adding a lot of confusing log messages,
> simply
> because of an unfortunately timed change event.
>  
> > Note also that if a "reconfigure" was carried out in the presence
> > of
> > paths with changed WWID, the final outcome would likely be the same
> > that
> > my patch now achieves without "reconfigure".
> 
> Yeah. I just checked, and this is very broken, and something needs to
> be
> done to fix it.  If a device gets a change event while it's down, it
> will no longer have the udev properties necessary to not be
> blacklisted,
> so the device gets blacklisted, and ignored.  Even after it comes
> back
> up and gest another change event to restore these values, multipathd
> still ignores it, because the device was blacklisted during its add
> event.

I wasn't aware of that.

> However, if you just set the wwid of a path to NULL and run
> reconfigure,
> it will stay added to the map, and get it's wwid restored to the
> multipath device's wwid, because configure() calls map_discovery(),
> which ends up calling disassemble_map(). This will set a NULL path
> wwid
> to the multipath device wwid, since disassemble_map() assumes that
> NULL
> wwids just mean that multipathd couldn't get the wwid temporarily.

My thinking was that reconfigure calls path_discovery(), which would
call pathinfo() on all available paths again. I understand now - if
there was an error processing the last uevent for a device, the
information in the udev DB will still be incomplete, and
path_discovery() will obtain the same broken information, possibly
blacklisting the device. I thought the WWID might then be restored by
the INIT_MISSING_UDEV logic, but disassemble_map() would almost
certainly come first. I'm not sure I like it that disassemble_map()
tries to fill in path information. It's ok to do this for paths
that haven't been detected at all (pp == NULL case) but in case of
paths that are present in the udev db, I'm uncertain.

This would also affect your treatment of "WWID changed to zero", no?
Any call to setup_multipath() or update_multipath() would call
disassemble_map(), which would possibly mess with the path's wwid...

> 
> > I case I've come to the wrong conclusions because I overlooked
> > something
> > essential, please tell me.
> > 
> > Going one step futher, I've actually come to think differently
> > about the
> > "fallback logic" for the case that no WWID can be obtained from
> > udev. I
> > believe now that such fallback logic should _not_ be used. The
> > point is not
> > to derive _some_ WWID, but _the right one_, and that's udev's job.
> > But udev
> > can be customized in complex ways that multipathd has no idea
> > about.
> > In the worst case, we'd receive some WWIDs from udev and some from
> > our
> > own logic, and combine paths into a map which wouldn't acutally
> > belong
> > together. Therefore I vote for ripping out the fallback logic
> > altogether
> > and depend on udev exclusively for WWID generation. I haven't
> > included this
> > in the current patch set in order not to make it too controversial.
> > 
> > The only purpose for the fallback logic that I could see is to
> > provide
> > a configuration option to force multipathd to _always_ determine
> > the
> > WWID by itself, ignoring udev device properties.
> 
> I understand your concern.  There are currently safeguards (obviously
> not foolproof ones). The fallback wwid has to match the existing
> wwid,
> otherwise the path gets disabled. It seems very unlikely that a path
> configured to use the default uid_attribute, which switched to the
> fallback method and still got the same wwid, should actually have a
> different udev wwid than the one it previously had and that the
> fallback
> method still has.  Also, once a path starts using the fallback
> method,
> it clears uid_attribute, so that it keeps using the fallback method.
> 
> But to get back to dealing with changed wwids, I agree with removing
> an
> re-adding a device if it's wwid has changed. So, leaving that aside,
> the
> sticky question is "what to do with a path that has a NULL wwid?"
> 
> There are a number of options:
> 
> 1. Ignore the NULL wwid. This is what my patch 07 does, and I'd like
> to
> make the case for this yet again. 

You'd reverted this change in your patch 12, which confused me quite a
bit. What is your preference, really?

> I feel pretty sure that a NULL wwid on
> a change event is almost always going to occur because that change
> event
> happened when the path was down. If that's not the reason, then I
> believe it is either your reasons a) or c) from above. a) is not
> fixable. c) is not really our problem. It's a service to try saving
> the
> sysadmin from their own mistakes. But there is no way to make it
> foolproof. It is always possible for the device to get remapped and
> start getting IO before multipathd ever gets the uevent. The best we
> can
> guarantee is that we will eventually detect when this happens, report
> it, and stop it from doing more damage.  Given this, I think it's
> perfectly reasonable to just let NULL wwids slide, because that's
> ususally the right thing to do, and everything else makes the common
> case work worse, while still not providing a guarantee that the worst
> case is avoided. This is still my preferred solution.
> 
> 2. Fail the path on a NULL wwid. In the checker loop, when the path
> is
> up again, re-attempt to get the wwid.  However, since the udev
> information we have won't contain the new wwid, use the fallback
> information. This is my patch 12, and I agree that there could be
> issues
> with the fallback wwid being determined differently, although like I
> mentioned above, multipathd tries to deal with them.
> 
> 3. Fail the path on a NULL wwid. In the checker loop, when the path
> is
> up again, trigger a change uevent to get the new information. This
> delays restoring a path once it's up, and increases the number of
> things
> outside of multipathd's control that could go wrong which would keep
> the
> path from getting restored (udev timeouts, inability for udev to
> complete, because access to a necessary device is down).
> 
> 4. Some sort of hybrid of 2 and 3, where devices can either get their
> wwid from udev or directly from the device, and depending on how they
> get their wwid, by necessity they do different things. This is
> workable
> but a lot of code for something that is just a best-effort attempt to
> protect sysadmins from themselves.

You have argued well that 2. is best for re-acquiring a reasonable WWID
quickly. Anyway, I don't think it's healthy if the udev information is
wrong for a device, and remains wrong. Thus, IMO, another change event
should be triggered anyway, hoping to restore a state where udev
properties correctly identify the device. If the broken ID is really
just the result of bad timing as you suggest, that should likely be
successful. Some care may be needed to make sure that we don't confuse
ourselves with that self-generated uevent.

> 4. Completely remove the path, and try to re-add it.  This method
> will
> likely cause at least as much delay as 3, with an even larger chance
> that something goes wrong and keeps access from being restored.
> That's
> completely fine if the wwid has changed to something else, but I
> really
> don't like having to do all that work just because a change event
> occurred on a device that was down.
> 
> Thoughts?

Let's go for the removal and re-addition for the non-NULL changed WWID
case, and use the quick path for the NULL-WWID case (maybe we could
even try the "fallback" immediately in uev_update_path()?).

You have a more complete understanding of the whole issue than me, so I
trust that you'll come up with a v3 series addressing all these issues.

Regards
Martin

(*) This works only if the tool used to retrieve the VPDs support
reading from sysfs, of course, as sg_inq does (scsi_id does not,
currently).

-- 
Dr. Martin Wilck <mwilck@xxxxxxxx>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel