On 27.01.2024 00:40, Orion Poplawski wrote:
On 1/26/24 01:21, Lennart Poettering wrote:
On Do, 25.01.24 16:28, Orion Poplawski (orion@xxxxxxxx) wrote:
We have various VMs that are backed by LUKS-encrypted LVs. At boot the volumes
are decrypted by clevis. The problem we are seeing at the moment is that the
VMs are started before the block devices are decrypted. Our current
solution is:
We generally wait for all devices listed in /etc/crypttab, unless you
set noauto or nofail.
We are setting 'nofail', because I don't think I want to fail the boot in
general. They are not required for the system itself to function, just
certain VMs. E.g.:
luks-backup /dev/vg_root/backup-raw none discard,_netdev,nofail
See below for more though.
# cat /etc/systemd/system/virtqemud.service.d/override.conf
[Unit]
After=blockdev@dev-mapper-luks\x2dbackup.target blockdev@dev-mapper-luks\x2dvm\x2d01\x2ddisk0.target
Here we list each volume to be decrypted so that it blocks the start of the
virtqemud service.
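If the list of volumes grows, the After= line could be generated rather than maintained by hand. The following is only a minimal sketch and not something from the thread: the function names blockdev_target and crypttab_after_line are made up for illustration, it assumes every entry in the crypttab file should block virtqemud, and the sed fallback only handles volume names made of letters, digits, "_" and "-". systemd-escape(1) is used when it is available.

```sh
#!/bin/sh
# Print the blockdev@ target name for one mapped volume name (e.g. luks-backup).
blockdev_target() {
    if command -v systemd-escape >/dev/null 2>&1; then
        # Let systemd do the path escaping of /dev/mapper/<name>.
        systemd-escape --template=blockdev@.target --path "/dev/mapper/$1"
    else
        # Fallback (assumption): the name contains only letters, digits,
        # "_" and "-", so escaping "-" as \x2d is sufficient.
        escaped=$(printf '%s\n' "$1" | sed 's/-/\\x2d/g')
        printf 'blockdev@dev-mapper-%s.target\n' "$escaped"
    fi
}

# Print an After= line covering every volume named in a crypttab-style file.
crypttab_after_line() {
    file=${1:-/etc/crypttab}
    targets=""
    # The first field of each non-blank, non-comment line is the volume name.
    for name in $(awk 'NF && $1 !~ /^#/ { print $1 }' "$file"); do
        targets="$targets $(blockdev_target "$name")"
    done
    printf 'After=%s\n' "${targets# }"
}
```

Running `crypttab_after_line /etc/crypttab` prints a single line that can be pasted under [Unit] in the override.conf shown above.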
Does anyone have any better alternatives? My main issue is that it feels
somewhere in between fine-grained and coarse-grained control.
Ideally I think each individual VM's startup would be delayed
automatically until the devices it uses become available, but I
don't see how to do this.
I am not sure how libvirt works, but if it runs every VM in a systemd
unit, then you could just order the device before that unit, or the
unit after the device.
Really depends on how libvirt splits things up.
I'm honestly not sure how libvirt works here either. But there seems to be this:
# rpm -qf /usr/lib/systemd/system/virtqemud.service
libvirt-daemon-driver-qemu-9.5.0-7.el9_3.alma.2.x86_64
which gets started:
Jan 25 14:42:58 systemd[1]: Starting Virtualization qemu daemon...
Jan 25 14:42:58 systemd[1]: Started Virtualization qemu daemon.
Then the qemu-kvm processes end up in their own scope:
● machine-qemu\x2d1\x2dsrv\x2dmry01.scope - Virtual Machine qemu-1-srv-mry01
Loaded: loaded (/run/systemd/transient/machine-qemu\x2d1\x2dsrv\x2dmry01.scope; transient)
Transient: yes
Active: active (running) since Thu 2024-01-25 14:42:58 PST; 22h ago
Tasks: 6 (limit: 16384)
Memory: 15.6G
CPU: 1h 15min 44.863s
CGroup: /machine.slice/machine-qemu\x2d1\x2dsrv\x2dmry01.scope
└─libvirt
└─9086 /usr/libexec/qemu-kvm -name guest=...
Alternatively it seems like one should be able to delay all VM startup until
all volumes in /etc/crypttab were unlocked, rather than having to specify each
one. But I don't see a target for that.
This is default behaviour. Anything listed in /etc/crypttab is ordered
before cryptsetup.target, which is ordered before sysinit.target,
which is ordered before basic.target, which is ordered before regular services.
We are specifying _netdev because they require the network to unlock. This I
think puts them under remote-cryptsetup.target, and I used to depend on that.
But with EL9 I'm seeing:
# j -b -u remote-cryptsetup.target \
    -u 'blockdev@dev-mapper-luks\x2dbackup.target' \
    -u clevis-luks-askpass.service --no-hostname
Jan 25 14:42:12 systemd[1]: Reached target Remote Encrypted Volumes.
Jan 25 14:42:12 systemd[1]: Started Forward Password Requests to Clevis.
Jan 25 14:42:48 clevis-luks-askpass[1706]: Unlocked /dev/vg_root/backup-raw (UUID=d6d25a85-2d43-4780-a312-e0e9b2383807) successfully
Jan 25 14:42:54 systemd[1]: Reached target Block Device Preparation for /dev/mapper/luks-backup.
Jan 25 14:42:59 systemd[1]: clevis-luks-askpass.service: Deactivated successfully.
# systemctl list-dependencies remote-cryptsetup.target
remote-cryptsetup.target
● ├─systemd-cryptsetup@luks\x2dbackup.service
# j --no-hostname -b -u 'systemd-cryptsetup@luks\x2dbackup.service'
Jan 25 14:42:12 systemd[1]: Starting Cryptography Setup for luks-backup...
Jan 25 14:42:42 systemd-cryptsetup[1697]: Set cipher aes, mode xts-plain64, key size 512 bits for device /dev/vg_root/backup-raw.
Jan 25 14:42:47 systemd-cryptsetup[1697]: Failed to activate with specified passphrase. (Passphrase incorrect?)
Jan 25 14:42:48 systemd-cryptsetup[1697]: Set cipher aes, mode xts-plain64, key size 512 bits for device /dev/vg_root/backup-raw.
Jan 25 14:42:54 systemd[1]: Finished Cryptography Setup for luks-backup.
# systemctl show 'systemd-cryptsetup@luks\x2dbackup.service' | grep Type
Type=oneshot
So, if I'm following things correctly, this doesn't seem right.
remote-cryptsetup.target depends on systemd-cryptsetup@luks\x2dbackup.service.
This is a oneshot, which is considered started only after its main process
exits; the journal above shows that happening at 14:42:54. But we are seeing
'Reached target Remote Encrypted Volumes' at 14:42:12.
What am I missing?
systemd-252-18.el9.x86_64
"nofail" encrypted devices are not ordered before
(remote-)cryptsetup.target, so that they do not delay startup. The reasoning is
that if you do not care whether the device exists, there is no reason to
wait for it globally anyway. I believe this has changed (even several
times) in the past.
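Whether a given volume is actually ordered before the target can be checked from the unit's Before= property. This is a small sketch under one assumption: that `systemctl show -p Before` prints a single "Before=unit1 unit2 ..." line. The helper name is_ordered_before is hypothetical.

```sh
#!/bin/sh
# Succeed if the Before= list read from stdin contains the given unit name.
# Expected input: the output of `systemctl show -p Before <unit>`.
is_ordered_before() {
    target=$1
    # Strip the "Before=" prefix, put one unit per line, match the whole name.
    sed -n 's/^Before=//p' | tr ' ' '\n' | grep -Fqx -- "$target"
}
```

For example: `systemctl show -p Before 'systemd-cryptsetup@luks\x2dbackup.service' | is_ordered_before remote-cryptsetup.target || echo "not ordered before remote-cryptsetup.target"`.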
If the device list is static, just add configuration snippets to
explicitly order their blockdev@ targets before
remote-cryptsetup.target. The /etc/fstab generator supports x-systemd.before
(and others); maybe that could be generalized to the /etc/crypttab generator.
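A configuration snippet of the kind suggested above might be written like this. It is a sketch, not a tested recipe for systemd 252: the helper name write_before_dropin and the file name 10-before-remote-cryptsetup.conf are made up, and whether the drop-in should go on the blockdev@ target or on the matching systemd-cryptsetup@ service is left to the caller. The unit name must be passed already escaped, as it appears in the journal output earlier in the thread.

```sh
#!/bin/sh
# Write a drop-in that orders one unit before remote-cryptsetup.target.
#   $1: drop-in root directory, e.g. /etc/systemd/system
#   $2: unit name, already escaped, e.g. 'blockdev@dev-mapper-luks\x2dbackup.target'
write_before_dropin() {
    dir="$1/$2.d"
    mkdir -p "$dir" || return 1
    # Before= only adds ordering; it does not pull the unit into the boot.
    printf '[Unit]\nBefore=remote-cryptsetup.target\n' \
        > "$dir/10-before-remote-cryptsetup.conf"
}
```

Usage would be along the lines of `write_before_dropin /etc/systemd/system 'blockdev@dev-mapper-luks\x2dbackup.target' && systemctl daemon-reload`, after which virtqemud.service only needs After=remote-cryptsetup.target.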