> Greetings,
>
> I've been using Fedora with a "simple" LVM setup with no problems for the
> last 3 years. Recently I decided to set up my laptop with LVM on top of
> LUKS in F23. While migration from the previous setup was relatively
> painless, I've been noticing issues with shutdown: I consistently observe
> logs stating failure to properly deactivate the logical volumes and the
> LUKS device (as reported by others in bug 1097322 [1], which unfortunately
> has been closed due to EOL). I don't know if they are spurious, which led
> me to investigate a bit about how things work, and I'm failing to make
> sense of it.
>
> I've noticed the existence of `blk-availability.service` in systemd. It's
> a service that does nothing on start, and calls the `blkdeactivate`
> executable on system shutdown, after the "special block-device" services
> (LVM, iSCSI, etc) have stopped. `blkdeactivate` is called with the option
> to unmount devices in use. But I don't see how it can ever succeed for the
> system root: other services will still be shutting down, and systemd's
> unmounting phase will not have been reached yet. The same might hold true
> for non-system-root mounts as well, if services that depend on them are in
> the same situation.
>
> My understanding was that special block-device handling was a task
> performed by dracut in the initramfs. It does have a shutdown hook called
> `dm-shutdown.sh` that uses the `dmsetup` executable to remove any
> device-mapper devices still enabled. I don't see any shutdown hooks for
> the LVM module, so I assume the DM module also takes care of them. Is my
> understanding correct?
>
> Wouldn't it be possible to replace the custom DM hook with a call to
> `blkdeactivate`, and remove the `blk-availability` service from the
> "normal root" shutdown? Could that possibly work better than the current
> setup, since `blkdeactivate` claims to be capable of handling nested
> device-mapper setups, and to be able to use LVM commands in a more
> intelligent way (for example, deactivating whole volume groups at once)?
> Shouldn't `blkdeactivate` at least be told not to unmount the root, as it
> will always fail?
>
> Apologies if I said anything egregiously wrong, and I'd be glad to be
> corrected in that case.
>
> [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1097322

I did some extra investigation, which made things clearer for me but
still shows some pain points.

I found out the cause of LVM not being deactivated: firewalld's service
stop removes some modules from the kernel (specifically nf_conntrack),
and that hangs, leaving an unkillable userland process (rmmod) that keeps
the root mount busy (which I reported as bug 1294415 [1]). The service
stop never finishes, so the shutdown timeout is reached, at which point
systemd attempts an unclean shutdown (TERM-ing and KILL-ing everything).
The LVM deactivation rightfully fails (as there is still a mounted
filesystem on the LV), and by extension, so does the LUKS deactivation.
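
To make the cascade concrete, here is a minimal Python sketch of why one
busy filesystem keeps the whole stack active. It is only a model, not how
LVM or cryptsetup decide anything: the device names (`rootfs`,
`fedora-root`, `luks-root`) and the `holders`/`pinned` structures are
hypothetical and chosen for illustration.

```python
def deactivation_result(holders, pinned):
    """Return {device: True/False}: whether each device could be released.

    holders: maps a device to the devices stacked directly on top of it.
    pinned:  devices that cannot be released themselves (for example a
             filesystem that is still mounted because a process is stuck).
    """
    result = {}

    def release(dev):
        if dev not in result:
            # A device is released only if everything stacked on top of it
            # was released first and nothing pins the device itself.
            above_ok = all([release(h) for h in holders.get(dev, [])])
            result[dev] = above_ok and dev not in pinned
        return result[dev]

    for dev in holders:
        release(dev)
    return result


# Hypothetical stack: root filesystem on an LV on a LUKS mapping.
stack = {
    "luks-root": ["fedora-root"],
    "fedora-root": ["rootfs"],
    "rootfs": [],
}
```

With `pinned={"rootfs"}` every entry in the result is `False`, which
matches what I see in the logs: the LV fails first, and the LUKS device
fails because the LV is still on top of it.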

blkdeactivate actually does already exclude a bunch of hardcoded
mountpoints (/ /boot /lib /lib64 /bin /sbin /var /usr /usr/lib
/usr/lib64 /usr/sbin /usr/bin). It still seems to me like an
insufficient solution: it's perfectly possible, if not likely, that a
userland process belonging to a service that has not yet finished holds
an open file in a non-excluded mountpoint. That will cause blkdeactivate
to fail at its job of unmounting (as it is currently called in
blk-availability.service) and at device deactivation.
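
As a rough illustration of what that exclusion list leaves exposed, the
Python sketch below filters text in /proc/mounts format against the same
list. It is not blkdeactivate's actual logic: the exact-match comparison,
the `/dev/` filter, and the names `SKIPPED` and `unmount_candidates` are
my own assumptions.

```python
# The hardcoded mountpoints listed above.
SKIPPED = frozenset(
    "/ /boot /lib /lib64 /bin /sbin /var /usr "
    "/usr/lib /usr/lib64 /usr/sbin /usr/bin".split()
)


def unmount_candidates(mounts_text):
    """Return block-device mountpoints that are NOT on the skip list.

    mounts_text is text in /proc/mounts format:
    "device mountpoint fstype options dump pass", one mount per line.
    Assumption: exact-match comparison; escaped characters such as \\040
    in mountpoints are not decoded.
    """
    candidates = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        device, mountpoint = fields[0], fields[1]
        if device.startswith("/dev/") and mountpoint not in SKIPPED:
            candidates.append(mountpoint)
    return candidates


# Hypothetical mount table for a laptop with a separate /home LV.
sample = """\
/dev/mapper/fedora-root / ext4 rw 0 0
/dev/mapper/fedora-home /home ext4 rw 0 0
/dev/sda1 /boot ext4 rw 0 0
tmpfs /tmp tmpfs rw 0 0
"""
```

For this sample, `unmount_candidates(sample)` returns `["/home"]`: any
service that is still running with a file open under /home would make the
unmount, and therefore the deactivation, fail.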

Unmounting should be systemd's job, as it holds the dependency tree and
can know with reasonable certainty that all userland processes have been
terminated if at all possible. blkdeactivate could then be run from the
initramfs root without doing any unmounting, as it will have already
been done if possible.

I also found a strange interaction that I suspect could cause some
failures in unusual situations. Dracut's current shutdown procedure
calls /usr/lib/dracut/dracut-initramfs-restore through the stop command
of dracut-shutdown.service. It prepares a new system root for late
shutdown, which systemd will pivot to once it is done. The script
attempts to mount /boot read-only unconditionally, which is normally
fine, as the service depends on boot.mount, and will only be stopped
after it is unmounted. But if the unmounting fails and /boot is still up
for some reason, the script will fail completely and there will be no
pivot at all, possibly leaving activated block devices at poweroff time.
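
One way the unconditional mount could be guarded is sketched below in
Python. This is an assumption about a possible fix, not the script's
current behaviour: `restore_plan` is a hypothetical helper, and the step
strings are illustrative descriptions rather than the commands the script
actually runs. Only `os.path.ismount` is a real library call.

```python
import os


def restore_plan(boot="/boot", ismount=os.path.ismount):
    """Return the steps a guarded restore would take, in order.

    ismount is injectable so the logic can be exercised without touching
    the real mount table.
    """
    steps = []
    mounted_here = False
    if not ismount(boot):
        # Only mount /boot if it is not already mounted.
        steps.append(f"mount -o ro {boot}")
        mounted_here = True
    steps.append("unpack the initramfs image")
    if mounted_here:
        # Undo only what this run did; leave a pre-existing mount alone.
        steps.append(f"umount {boot}")
    return steps
```

With /boot already mounted (for example because its unmount failed
earlier), the plan skips the mount step and still unpacks the image, so
the pivot would not be lost.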

I don't know if either of these two suggestions will solve the problems
reported in the bug I originally linked, but they seem like worthwhile
improvements either way. I would be thankful if anyone with good
knowledge of Dracut could offer some insight on this.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1294415

Regards,
Daniel