Hi Coiby,
I've started working on a patchset for systemd utility. I have one
question/suggestion:
On 10/01/2024 08:15, Coiby Xu wrote:
LUKS is the standard for Linux disk encryption. Many users choose LUKS
and in some use cases like Confidential VM it's mandated. With kdump
enabled, when the 1st kernel crashes, the system could boot into the
kdump/crash kernel and dump the memory image i.e. /proc/vmcore to a
specified target. Currently, when dumping vmcore to a LUKS
encrypted device, there are two problems,
- Kdump kernel may not be able to decrypt the LUKS partition. For some
machines, a system administrator may not have a chance to enter the
password to decrypt the device in kdump initramfs after the 1st kernel
crashes; For cloud confidential VMs, depending on the policy the
kdump kernel may not be able to unseal the key with TPM and the
console virtual keyboard is untrusted
- LUKS2 by default use the memory-hard Argon2 key derivation function
which is quite memory-consuming compared to the limited memory reserved
for kdump. Take Fedora example, by default, only 256M is reserved for
systems having memory between 4G-64G. With LUKS enabled, ~1300M needs
to be reserved for kdump. Note if the memory reserved for kdump can't
be used by 1st kernel i.e. an user sees ~1300M memory missing in the
1st kernel.
Besides users (at least for Fedora) usually expect kdump to work out of
the box i.e. no manual password input is needed. And it doesn't make
sense to derivate the key again in kdump kernel which seems to be
redundant work.
This patch set addresses the above issues by reusing the LUKS volume key
in kdump kernel with the help of cryptsetup's new APIs
(--link-vk-to-keyring/--volume-key-keyring). Here is the life cycle of
this kdump copy of LUKS volume key,
1. After the 1st kernel loads the initramfs during boot, systemd
use an user-input passphrase or TPM-sealed key to de-crypt the LUKS
volume key and then save the volume key to specified keyring
(using the --link-vk-to-keyring API) and the key will expire within
specified time.
2. A user space tool (kdump initramfs builder) writes the key description to
/sys/kernel/crash_dm_crypt_key to inform the 1st kernel to save a
temporary copy of the volume key while building the kdump initramfs
So this volume key copy cached by systemd utility in 1st kernel does not
have to be readable from userspace.
3. The kexec_file_load syscall saves the temporary copy of the volume
key to kdump reserved memory and wipe the copy.
4. When the 1st kernel crashes and the kdump initramfs is booted, the kdump
initramfs asks the kdump kernel to create a user key using the
key stored in kdump reserved memory by writing the key
description to /sys/kernel/crash_dm_crypt_key. Then the LUKS
encrypted devide is unlocked with libcryptsetup's
--volume-key-keyring API.
Unlike here where it has to readable from uspace so that libcryptsetup
can verify the volume key.
Is it correct?
O.
5. The system gets rebooted to the 1st kernel after dumping vmcore to
the LUKS encrypted device is finished
After libcryptsetup saving the LUKS volume key to specified keyring,
whoever takes this should be responsible for the safety of this copy
of key. This key will be saved in the memory area exclusively reserved
for kdump where even the 1st kernel has no direct access. And further
more, two additional protections are added,
- save the copy randomly in kdump reserved memory as suggested by Jan
- clear the _PAGE_PRESENT flag of the page that stores the copy as
suggested by Pingfan
This patch set only supports x86. There will be patches to support other
architectures once this patch set gets merged.
_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec