Re: [PATCH v2] multipath -u: test socket connection in non-blocking mode

On Wed, 2019-04-24 at 11:07 +0200, Martin Wilck wrote:
> Since commit d7188fcd "multipathd: start daemon after udev trigger",
> multipathd startup is delayed during boot until after "udev settle"
> terminates. But "multipath -u" is run by udev workers for storage
> devices, and attempts to connect to the multipathd socket. This causes
> a start job for multipathd to be scheduled by systemd, but that job
> won't be started until "udev settle" finishes. This is not a problem
> on systems with 129 or less storage units, because the connect() call
> of "multipath -u" will succeed anyway. But on larger systems, the
> listen backlog of the systemd socket can be exceeded, which causes
> connect() calls for the socket to block until multipathd starts up and
> begins calling accept(). This creates a deadlock situation, because
> "multipath -u" (called by udev workers) blocks, and thus "udev settle"
> doesn't finish, delaying multipathd startup. This situation then
> persists until either the workers or "udev settle" time out. In the
> former case, path devices might be misclassified as non-multipath
> devices by "multipath -u".
> 
> Fix this by using a non-blocking socket fd for connect() and
> interpret the errno appropriately.
> 
> This patch reverts most of the changes from commit 8cdf6661
> "multipath: check on multipathd without starting it". Instead,
> "multipath -u" does access the socket and start multipath again
> (which is what we want IMO), but it is now able to detect and handle
> the "full backlog" situation.
> 
> Signed-off-by: Martin Wilck <mwilck@xxxxxxxx>
> 
> V2:
> 
> Use same error reporting convention in __mpath_connect() as in
> mpath_connect() (Hannes Reinecke). We can't easily change the latter,
> because it's part of the "public" libmpathcmd API.

FTR, our customer reported that this patch fixed his problem.

@Ben, I'd be grateful if you could try it (or have the user try it)
in your problem case as well.
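
For anyone following the thread without the diff at hand, the idea of a
non-blocking connect() with errno interpretation can be sketched
roughly as below. This is only an illustration under stated
assumptions, not the code from the patch: connect_nonblocking(),
probe_daemon() and enum daemon_state are hypothetical names, the helper
takes a filesystem socket path purely for demonstration, and the
mapping of EAGAIN to "daemon is starting" is my reading of the "full
backlog" case (Linux reports EAGAIN for a non-blocking UNIX domain
socket whose connection cannot be completed immediately). SOCK_NONBLOCK
is Linux-specific.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical classification of the probe result (illustration only). */
enum daemon_state {
	DAEMON_REACHABLE,	/* connect() succeeded */
	DAEMON_STARTING,	/* EAGAIN: listener exists, backlog is full */
	DAEMON_ABSENT,		/* any other error, e.g. ENOENT or ECONNREFUSED */
};

/*
 * Hypothetical helper, not part of libmpathcmd.
 * Returns a connected fd, or -1 with errno set.
 */
static int connect_nonblocking(const char *path)
{
	struct sockaddr_un addr;
	int fd, saved_errno;

	if (strlen(path) >= sizeof(addr.sun_path)) {
		errno = ENAMETOOLONG;
		return -1;
	}

	/* Non-blocking from the start, so connect() can never hang. */
	fd = socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);

	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		/* close() may clobber errno, so preserve it for the caller. */
		saved_errno = errno;
		close(fd);
		errno = saved_errno;
		return -1;
	}
	return fd;
}

/* Probe the socket once and interpret errno instead of blocking. */
static enum daemon_state probe_daemon(const char *path)
{
	int fd = connect_nonblocking(path);

	if (fd >= 0) {
		close(fd);
		return DAEMON_REACHABLE;
	}
	if (errno == EAGAIN)
		return DAEMON_STARTING;
	return DAEMON_ABSENT;
}
```

The point of the sketch is that the caller returns immediately in all
three cases, so a udev worker would not sit in connect() while "udev
settle" is waiting for it.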

-- 
Dr. Martin Wilck <mwilck@xxxxxxxx>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)


--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel



