Re: LVM and DM mounting process: questions about the interactions between systemd, dracut and blk-availability.service


> Greetings,
>
> I've been using Fedora with a "simple" LVM setup with no problems for the
> last 3 years. Recently I decided to set up my laptop with LVM on top of
> LUKS in F23. While the migration from the previous setup was relatively
> painless, I've been noticing issues with shutdown: I consistently observe
> logs stating failure to properly deactivate the logical volumes and the
> LUKS device (as reported by others in bug 1097322 [1], which unfortunately
> has been closed due to EOL). I don't know if they are spurious, which led
> me to investigate a bit into how things work, and I'm failing to make
> sense of it.
>
> I've noticed the existence of `blk-availability.service` in systemd. It's
> a service that does nothing on start, and calls the `blkdeactivate`
> executable on system shutdown, after the "special block-device" services
> (LVM, iSCSI, etc.) have stopped. `blkdeactivate` is called with the option
> to unmount devices in use. But I don't see how it can ever succeed for the
> system root: other services will still be shutting down, and systemd's
> unmounting phase will not have been reached yet. The same may hold true
> for non-system-root mounts as well, if services that depend on them are in
> the same situation.
>
> My understanding was that special block-device handling was a task
> performed by dracut in the initramfs. It does have a shutdown hook called
> `dm-shutdown.sh` that uses the `dmsetup` executable to remove any
> device-mapper devices still enabled. I don't see any shutdown hooks for
> the LVM module, so I assume the DM module also takes care of them. Is my
> understanding correct?
>
> Wouldn't it be possible to replace the custom DM hook with a call to
> `blkdeactivate`, and remove the `blk-availability` service from the
> "normal root" shutdown? Could that work better than the current setup,
> since `blkdeactivate` claims to be capable of handling nested
> device-mapper setups, and of using LVM commands in a more intelligent way
> (for example, deactivating whole volume groups at once)? Shouldn't
> `blkdeactivate` at least be told not to unmount the root, as that will
> always fail?
>
> Apologies if I said anything egregiously wrong; I'd be glad to be
> corrected in that case.
>
> [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1097322
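For context, `blk-availability.service` as shipped by lvm2 looks roughly like this (reproduced from memory, so the exact flags and ordering dependencies may differ between releases):

```
[Unit]
Description=Availability of block devices
After=lvm2-activation.service iscsi.service

[Service]
Type=oneshot
ExecStart=/bin/true
ExecStop=/usr/sbin/blkdeactivate -u -l wholevg
RemainAfterExit=yes

[Install]
WantedBy=sysinit.target
```

The `-u` flag is what asks `blkdeactivate` to unmount devices in use, and `-l wholevg` asks it to deactivate whole volume groups at once.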

I did some extra investigation, which made things clearer for me but still revealed some pain points.

I found the cause of LVM not being deactivated: stopping firewalld's service removes some modules from the kernel (specifically nf_conntrack), and that hangs, leaving an unkillable userland process (rmmod) that keeps the root mount alive (which I reported as bug 1294415 [1]).

The service stop never finishes until the shutdown timeout is reached, at which point systemd attempts to shut down uncleanly (SIGTERM-ing and SIGKILL-ing everything). The LVM deactivation rightfully fails (as there is still a mounted filesystem on the LV), and by extension, so does the LUKS deactivation.

blkdeactivate actually does already exclude a set of hardcoded mountpoints (/ /boot /lib /lib64 /bin /sbin /var /usr /usr/lib /usr/lib64 /usr/sbin /usr/bin). That still seems insufficient to me: it's perfectly possible, if not likely, that a userland process belonging to a service that has not yet finished stopping holds an open file in a non-excluded mountpoint. That will cause blkdeactivate to fail at its job of unmounting (as it is currently called from blk-availability.service) and deactivating devices.
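To illustrate why a hardcoded list can't be enough: the skip check presumably looks something like the sketch below (variable and function names are mine, not the actual script's), and any mountpoint outside the list can still be pinned by an open file:

```shell
# Hypothetical sketch of the kind of exclusion check blkdeactivate makes.
# The list matches the hardcoded mountpoints above; names are illustrative.
excluded_mounts="/ /boot /lib /lib64 /bin /sbin /var /usr /usr/lib /usr/lib64 /usr/sbin /usr/bin"

is_excluded() {
    for mnt in $excluded_mounts; do
        [ "$1" = "$mnt" ] && return 0
    done
    return 1
}

is_excluded /boot && echo "skipped: /boot"
# /home is not in the list, so an unmount will be attempted; it can still
# fail if some not-yet-stopped service holds a file open there.
is_excluded /home || echo "will try to unmount: /home"
```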

Unmounting should be systemd's job, as it holds the dependency tree and can know with reasonable certainty that all userland processes have been terminated, where that is at all possible. blkdeactivate could then be run from the initramfs root without doing any unmounting, as that will already have been done where possible.
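Concretely, the dracut shutdown hook could then shrink to a deactivation-only call, something like this sketch (assuming `blkdeactivate` is packed into the shutdown initramfs; exact flags may vary by version):

```
#!/bin/sh
# Hypothetical replacement for dracut's dm-shutdown.sh hook (sketch only):
# deactivate any remaining device-mapper/LVM devices, deliberately without
# -u, since systemd has already unmounted whatever it could.
blkdeactivate -l wholevg
```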

I also found a strange interaction that I suspect could cause failures in unusual situations. Dracut's current shutdown procedure calls /usr/lib/dracut/dracut-initramfs-restore through the stop command of dracut-shutdown.service. It prepares a new system root for late shutdown, which systemd pivots to after it is done. The script attempts to mount /boot read-only unconditionally, which is normally fine, as the service depends on boot.mount and will only be stopped after it has been unmounted. But if that unmounting fails and /boot is still mounted for some reason, the script will fail completely and there will be no pivot at all, possibly leaving activated block devices at poweroff time.
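A fail-soft approach in the restore script would avoid that: treat a failed /boot mount as a warning instead of aborting the whole restore. A minimal sketch of the behaviour, with `mount_boot` as a stand-in for the real mount call so it can be shown without root:

```shell
# Sketch of fail-soft handling dracut-initramfs-restore could adopt.
# mount_boot stands in for "mount -o ro /boot"; here it simulates /boot
# being busy or already gone.
mount_boot() { false; }

if mount_boot; then
    echo "mounted /boot read-only"
else
    # Warn but keep going: losing /boot is better than losing the pivot.
    echo "warning: could not mount /boot, continuing without it" >&2
fi
echo "late-shutdown root prepared for pivot"
```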

I don't know whether either of these two suggestions will solve the problems reported in the bug I originally linked, but they seem like worthwhile improvements either way. I would be thankful if anyone with good knowledge of Dracut could offer some insight.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1294415

Regards,
Daniel

--
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxxx
http://lists.fedoraproject.org/admin/lists/devel@xxxxxxxxxxxxxxxxxxxxxxx


