Re: [PATCH 2/7] multipathd: fix check_path errors with removed map

On 6/19/20 1:17 AM, Benjamin Marzinski wrote:
On Thu, Jun 18, 2020 at 07:34:38PM +0000, Martin Wilck wrote:
On Wed, 2020-06-17 at 19:24 -0500, Benjamin Marzinski wrote:
If a multipath device is removed during, or immediately before, the
call to check_path(), multipathd can behave incorrectly. A missing
multipath device will cause update_multipath_strings() to fail,
setting pp->dmstate to PSTATE_UNDEF. If the path is up, this state
will cause reinstate_path() to be called, which will also fail. This
will trigger a reload, restoring the recently removed device.

If update_multipath_strings() fails because there is no multipath
device, check_path() should just quit, since the remove dmevent and
uevent are likely already queued up. Also, I don't see any reason to
reload the multipath device if reinstate fails. This code was added
by fac68d7a99ef17d496079538a5c6836acd7911ab, which claimed that
reinstate could fail if the path was disabled. Looking through the
current kernel code, I can't see any reason why a reinstate would
fail where a reload would help. If the path was missing from the
multipath device, update_multipath_strings() would already catch
that and quit check_path() early, which makes more sense to me than
reloading does.
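
(For illustration only, here is a minimal, self-contained sketch of
the flow described in the commit message. It is not the actual
multipath-tools code; the helper names mirror the mail, but their
stubbed signatures, fields and return values are assumptions.)

    #include <stdio.h>

    enum pstate { PSTATE_UNDEF, PSTATE_UP };

    struct path {
            enum pstate dmstate;
            int map_exists;     /* stand-in for "the dm map still exists" */
    };

    /* Stub: fails when the multipath map has already been removed. */
    static int update_multipath_strings(struct path *pp)
    {
            if (!pp->map_exists) {
                    pp->dmstate = PSTATE_UNDEF;
                    return 1;   /* failure */
            }
            pp->dmstate = PSTATE_UP;
            return 0;
    }

    /* Stub: reinstating a path in a removed map cannot succeed. */
    static int reinstate_path(struct path *pp)
    {
            return pp->map_exists ? 0 : 1;
    }

    static void check_path(struct path *pp)
    {
            if (update_multipath_strings(pp)) {
                    /* Patched behavior: the map is gone, so quit early;
                     * the remove dmevent/uevent is already queued. */
                    printf("map gone, skipping path\n");
                    return;
            }
            if (pp->dmstate == PSTATE_UNDEF && reinstate_path(pp)) {
                    /* The old code reloaded the map on reinstate failure,
                     * resurrecting the just-removed device; the patch
                     * drops that reload. */
                    printf("reinstate failed, not reloading\n");
            }
    }

    int main(void)
    {
            struct path removed_map_path = { PSTATE_UNDEF, 0 };
            check_path(&removed_map_path);  /* -> "map gone, skipping path" */
            return 0;
    }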

fac68d7 is related to the famous "dm-multipath: Accept failed paths for
multipath maps" patch (e.g.
https://patchwork.kernel.org/patch/3368381/#7193001), which never made
it upstream. SUSE kernels have shipped this patch for a long time, but
we don't apply it any more in recent kernels.

With this patch, the reinstate_path() failure would occur if multipathd
had created a table with a "disabled" device (one which would be
present in a dm map even though the actual block device didn't exist),
and then tried to reinstate such a path. It sounds unlikely, but it
might be possible if devices are coming and going in quick succession.
In that situation, and with the "accept failed path" patch applied, a
reload makes some sense, because reload (unlike reinstate) would not
fail (at least not for this reason) and would actually add that just-
reinstated path. OTOH, it's not likely that the reload would improve
matters, either. After all, multipathd is just trying to reinstate a
non-existing path. So, I'm fine with skipping the reload.

It's actually _not_ unlikely, but a direct result of multipathd listening to uevents.

If you have a map with four paths and all four of them are going down, you end up getting four events. And multipathd will process each event _individually_, causing it to generate a reload sequence like:

[a b c d]
[b c d]
[c d]
[d]
[]

Of which only the last is valid, as all the intermediate ones contain invalid paths.

And _that_ is the scenario I was referring to when creating the patch.
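
(Again purely for illustration, a tiny self-contained program that
prints the reload sequence above. It is not multipathd code; it just
shows that handling each remove event individually produces a series
of intermediate tables that still list paths whose devices may
already be gone, and that only the final, empty table is valid.)

    #include <stdio.h>

    int main(void)
    {
            const char *paths[] = { "a", "b", "c", "d" };
            const int n = 4;

            printf("initial table: [a b c d]\n");

            /* One remove event per path; a reload follows each one. */
            for (int removed = 1; removed <= n; removed++) {
                    printf("reload after event %d: [", removed);
                    for (int i = removed; i < n; i++)
                            printf("%s%s", paths[i], i + 1 < n ? " " : "");
                    printf("]\n");
            }
            return 0;
    }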

Cheers,

Hannes
--
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@xxxxxxx                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel



