Re: [PATCH 2/7] multipathd: fix check_path errors with removed map

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jun 19, 2020 at 08:32:34AM +0200, Hannes Reinecke wrote:
> >>
> >>fac68d7 is related to the famous "dm-multipath: Accept failed paths for
> >>multipath maps" patch (e.g.
> >>https://patchwork.kernel.org/patch/3368381/#7193001), which never made
> >>it upstream. SUSE kernels have shipped this patch for a long time, but
> >>we don't apply it any more in recent kernels.
> >>
> >>With this patch, The reinstate_path() failure would occur if multipathd
> >>had created a table with a "disabled" device (one which would be
> >>present in a dm map even though the actual block device didn't exist),
> >>and then tried to reinstate such a path. It sounds unlikely, but it
> >>might be possible if devices are coming and going in quick succession.
> >>In that situation, and with the "accept failed path" patch applied, a
> >>reload makes some sense, because reload (unlike reinstate) would not
> >>fail (at least not for this reason) and would actually add that just-
> >>reinstated path. OTOH, it's not likely that the reload would improve
> >>matters, either. After all, multipathd is just trying to reinstate a
> >>non-existing path. So, I'm fine with skipping the reload.
> >>
> It's actually _not_ unlikely, but a direct result of multipathd listening to
> uevents.
> 
> If you have a map with four paths, and all four of them are going down, you
> end up getting four events.
> And multipath will be processing each event _individually_, causing it to
> generate a reload sequence like:
> 
> [a b c d]
> [b c d]
> [c d]
> [d]
> []
> 
> Of which only the last is valid, as all the intermediate ones contain
> invalid paths.
> 
> And _that_ is the scenario I was referring to when creating the patch.

But even still, I'm not in favor of calling ev_add_path() with the code
as it currently is. In the case you point out, it will presumably fail
when adopt_paths() calls pathinfo(). That will orphan the path, which is
good. But if the map got removed (intentionally) instead of the path,
ev_add_path() will recreate the map, which is bad. Also, it seems more
likely for a path to still exist and be usable while the multipath
device is gone (the bad case), than for the path to pass the checker
function and then disappear before mutipathd tries to reinstate it
immediately afterwards (the good case).  Further, the bad result, where
a deleted multipath device reappears, is actually a problem for users.
Having multipathd take a bit of time and pointless effort to catch up
after multiple changes happen at once doesn't effect users noticeably.

Since the checkerloop thread spends the vast majority of it's not not
checking any specific path, if a path goes away, it is most likely to be
caught by the path_offline() function. I we want to do something to
proactively deal with a path that has been removed, we should do it
there.

-Ben
 
> Cheers,
> 
> Hannes
> -- 
> Dr. Hannes Reinecke            Teamlead Storage & Networking
> hare@xxxxxxx                               +49 911 74053 688
> SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
> HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel





[Index of Archives]     [DM Crypt]     [Fedora Desktop]     [ATA RAID]     [Fedora Marketing]     [Fedora Packaging]     [Fedora SELinux]     [Yosemite Discussion]     [KDE Users]     [Fedora Docs]

  Powered by Linux