Multipath bootup failure

Booting a machine with a multipathed kpartx root device fails for me
using the Fedora Rawhide multipath packages, which are based on the
0.9.7 release. Using LVM on top works. The issue is that when the root
device is directly on a partition, dracut finds it on one of the path
devices, and starts using that. If multipathd isn't running when the
uevent for that path device is processed, it won't be claimed by
multipath (starting in 0.9.7), since there is no multipathd.socket in
the initramfs and no systemd_service_enabled().  Afterwards, multipathd
creates a multipath device on top of the device, claims it, and removes
the partitions. If this happens while dracut is attempting to mount the
root device, the boot fails. In practice, it usually failed for me.

Reverting 6fad1464 ("libmpathutil: remove systemd_service_enabled()")
resolves the problem. When I tried to add 

Before=dracut-pre-mount.service

to dracut's version of multipathd.service instead, it worked over 95% of
the time, but it still occasionally failed. The issue is that even though
multipathd creates the multipath device before signaling that it has
started up, meaning that dracut won't start working towards mounting the
root device until after the multipath device exists, dracut won't know
not to use the underlying device partition until it processes the uevents
that get triggered by multipathd creating the device. And it won't be
able to use the kpartx device until it processes the uevents that get
triggered by kpartx running when processing the multipath device uevents.
Depending on how quickly dracut processes these events relative to the
rest of the bootup work, it can still hang. I've tested
adding

Before=systemd-udev-trigger.service

to multipathd.service with no failures so far. This requires fixing
multipathd-configure.service, so that there aren't any dependency
conflicts, but that should happen anyway.  I need to talk to the CoreOS
people who added this, but I think the only necessary dependency for
multipathd-configure.service to come after is

After=dracut-cmdline.service

With this, I think that multipathd should always be running before
device uevents get processed, but perhaps it needs to be before
systemd-udevd.service instead.
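
As a rough sketch of the ordering described above, the relevant [Unit]
lines in the initramfs copies of the two units might look like the
following. Only the Before= and After= lines quoted above are taken from
this mail; every other directive in each unit is omitted, and the
commented-out alternative is untested.

```ini
# multipathd.service (dracut's initramfs copy), [Unit] excerpt only
[Unit]
# Have multipathd running before the coldplug uevents are triggered,
# so path devices can be claimed as their uevents are processed.
Before=systemd-udev-trigger.service
# Untested alternative, if the ordering above turns out not to be early enough:
# Before=systemd-udevd.service

# multipathd-configure.service, [Unit] excerpt only
[Unit]
# Assumed to be the only ordering this unit really needs.
After=dracut-cmdline.service
```

Any ordering that multipathd-configure.service currently declares after
the udev units would have to be dropped for this to work, since that
unit also has to finish before multipathd starts.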

If it's not possible to guarantee that multipathd has started before we
process uevents so that we always claim the path devices as soon as they
appear, then to close this race window, we need to either wait after
multipathd starts for all the uevents to settle (and I don't think we
want to get back into the business of relying on udev-settle), or to go
back to some method of making multipath able to claim devices before
multipathd starts. 

Or we do something more clever. Thoughts?

-Ben




