Fwd: multipath-tools: incompatibility with systemd (or udev) >= 213

Alasdair,

Did lvm2 also need to work around this bad flock interaction with systemd/udev?
If so, can you share advice on a possible fix?

FYI, multipathd acquires a non-blocking exclusive lock on each path of a multipath device before creating the multipath dm map, and releases the locks immediately afterwards.

Best regards,
Christophe Varoqui

---------- Forwarded message ----------
From: Alexander E. Patrakov <patrakov@xxxxxxxxx>
Date: Thu, Oct 30, 2014 at 4:01 PM
Subject: multipath-tools: incompatibility with systemd (or udev) >= 213
To: christophe.varoqui@xxxxxxxxxxx


Hello.

Some time ago, I complained on systemd-devel that my computer (which, at that time, had multipath-tools installed needlessly) booted unreliably:

http://lists.freedesktop.org/archives/systemd-devel/2014-September/022812.html

Further investigation of this issue:

http://lists.freedesktop.org/archives/systemd-devel/2014-October/024572.html

I CC'd dm-devel (https://www.redhat.com/archives/dm-devel/2014-October/msg00197.html) when I found the root cause of the unreliable boot, but so far I have received no reaction. Hence this poke. The bug is quite critical: merely installing your package and regenerating the dracut-based initramfs must not lead to an unbootable system, but it does.

Basically, the troublemaker is multipathd. As part of its operation, it locks various block devices (including /dev/sda) using the flock system call. Such locking is incompatible with udevd from systemd >= 213, because systemd-udevd itself takes a lock on the device for rule processing, and does not process the rules if the device is already locked by someone else. Arguably, it is a bug in systemd that it does not retry later, but who am I to say so? Still, the lack of rule processing means that the symlinks in /dev/disk/by-* pointing to the partitions are not created, and thus the root device is never found.
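
To make the contention concrete, here is a minimal sketch of two independent openers competing for a flock on the same file. It is an illustration only, not code from multipathd or systemd: a temporary file stands in for the block device, the helper names try_exclusive_lock and demo_contention are made up, and the assumption that both sides request a non-blocking exclusive lock is mine (the exact lock mode used by systemd-udevd is not stated above).

```python
import fcntl
import os
import tempfile


def try_exclusive_lock(fd):
    """Attempt a non-blocking exclusive flock; return True on success."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        # Somebody else already holds a conflicting lock on this file.
        return False


def demo_contention():
    # Assumption: a temporary file stands in for a device node like /dev/sda.
    with tempfile.NamedTemporaryFile() as stand_in:
        # Two separate open() calls give two open file descriptions,
        # so their flock locks conflict even within one process.
        first = os.open(stand_in.name, os.O_RDONLY)   # plays multipathd
        second = os.open(stand_in.name, os.O_RDONLY)  # plays systemd-udevd
        try:
            first_got_lock = try_exclusive_lock(first)
            second_while_held = try_exclusive_lock(second)
            fcntl.flock(first, fcntl.LOCK_UN)
            second_after_release = try_exclusive_lock(second)
        finally:
            os.close(first)
            os.close(second)
    return first_got_lock, second_while_held, second_after_release
```

The second opener fails only while the first holds the lock, so a single attempt without a later retry is enough to lose the event if the timing is unlucky.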

Please either persuade the systemd maintainers to stop locking block devices because this API is already used for a different purpose, or persuade them to retry rule processing (they can probably do so when they detect via inotify or similar means that the device has been closed), or stop locking block devices in multipathd (for example, by locking something else, such as zero-length files in /run/lock).
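
The last option could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper names lock_file_for and sidecar_lock, and the "multipath-<name>.lock" naming scheme, are hypothetical and not taken from multipath-tools. The point is that the flock is taken on a zero-length file rather than on the device node, so it can no longer collide with whatever lock udevd takes on the device.

```python
import contextlib
import fcntl
import os


def lock_file_for(device, lock_dir="/run/lock"):
    """Map a device node to a lock file path (hypothetical naming scheme)."""
    name = os.path.basename(device)
    return os.path.join(lock_dir, "multipath-" + name + ".lock")


@contextlib.contextmanager
def sidecar_lock(device, lock_dir="/run/lock"):
    """Hold an exclusive flock on a zero-length file instead of the device."""
    path = lock_file_for(device, lock_dir)
    # Create the lock file if needed; nothing is ever written to it.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Raises BlockingIOError if another holder already has the lock.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        yield path
    finally:
        os.close(fd)  # closing the descriptor also drops the lock
```

A caller would wrap the map creation in "with sidecar_lock('/dev/sda'):". This only helps if every program that currently relies on the device lock for mutual exclusion agrees on the same lock files.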

By the way, I suspect that this lock contention also has an impact on multipathd itself, potentially causing it to disregard some paths because their block devices were locked by systemd-udevd when multipathd attempted to look at them. I am not an expert here, because I don't have any hardware that needs multipath.

--
Alexander E. Patrakov

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
