But now more and more users expect this security feature and want
kexec_file_load to guarantee it through IMA. I tried to address that with
the extra embedded-signature method, but that approach was also disliked.
Inspired by Philipp's suggestion, in the discussion of [1], to implement
systemd-stub in bpf opcode, I turned to bpf in the hope that parsers
written as bpf-programs can resolve this issue.

[1]: https://lore.kernel.org/lkml/20240819145417.23367-1-piliu@xxxxxxxxxx/T/
[2]: https://lore.kernel.org/kexec/20230306030305.15595-1-kernelfans@xxxxxxxxx/
[3]: https://lore.kernel.org/lkml/20230911052535.335770-1-kernel@xxxxxxxx/
[4]: https://lore.kernel.org/linux-arm-kernel/20230921133703.39042-2-kernelfans@xxxxxxxxx/T/

*** Reflect the problem and a new proposal ***

The UEFI emulator is anchored to the UEFI spec, which incurs a lot of work
because of the variety of hardware that has to be supported. For example,
to support TPM, the emulator would have to implement the PCI/I2C bus
protocols.

But if the problem is confined to the original linux kernel boot protocol,
it becomes simple. Only three things need to be considered: the kernel
image, the initrd and the command line. If we can obtain them in a secure
way, we can tackle the problem.

The integrity of each file is ensured by its signature envelope. If the
kexec'ed files are parsed in user space, the envelopes are opened and
become invalid. So the files should be passed into kernel space intact,
and be verified and manipulated there. And to manipulate files of various
formats, we need bpf-programs that know those formats.

There are three parties in this solution:

-1. The kexec-tools itself is protected by IMA. It creates a bpf-map and
updates the map with the UKI addon file names. Later, the bpf-program
calls a bpf-helper to add these files to the initrd.

-2. The bpf-program is contained in a dedicated '.bpf' section of the PE
file. When kexec_file_load loads a PE image, it extracts the '.bpf'
section and exposes it to user space through procfs. kexec-tools then
starts the program.
In this way, the bpf-program itself is protected from tampering.
The bpf-program does two things:
    -1. parse the image format
    -2. call bpf kexec helpers to manipulate signed files

-3. The bpf helpers. Three helpers will be introduced: the first for data
exchange between the bpf-program and the kernel, the second for
decompression, and the third for manipulating the cpio.

*** Overview of the design in Pseudocode ***

ThreadA: kexec thread which invokes kexec_file_load
ThreadB: the dedicated thread in kexec-tools to load bpf-prog

------
Diag 1. the interaction between bpf-prog loader and its executor

ThreadA                                 ThreadB

                                        wait on eventfd_A
expose bpf-prog through procfs
  & signal eventfd_A
  & wait on eventfd_B
                                        read the bpf-prog from procfs
                                          & initialize the bpf and
                                            install it to the fentry
                                          & signal eventfd_B
                                          & wait on eventfd_A again
fentry executes bpf-prog to parse image
  & generate output for the next step

-------------------
Diag 2. bpf-prog

SEC("fentry/kexec_pe_parser_hook")
int BPF_PROG(pe_parser, struct kimage *image, ...)
{
	buf = bpf_ringbuf_reserve(rb, size);
	buf_result = bpf_ringbuf_reserve(rb, res_sz);

	/* Ask kernel to copy the resource content to here */
	bpf_helper_carrier(resource_name, buf, size, in);

	/* Parse the format laying on buf */
	...

	/* call extra bpf-helpers */
	...

	/* Ask kernel to copy the resource content from here */
	bpf_helper_carrier(resource_name, buf_result, res_sz, out);

	return 0;
}

At present, the bpf map functions provide a mechanism for exchanging data
between user space and a bpf-prog. But between a bpf-prog and the kernel
there is no good choice. So I introduce a bpf helper function

	bpf_helper_carrier(resource_name, buf, size, in)

The pseudocode above uses it to exchange data between the kernel and the
bpf-prog. In this way, the data parsing process is no longer exposed to
user space.
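To make the in/out semantics of the carrier easier to discuss, here is a
rough user-space model of it. This is only an illustration and not code
from this series. Every name in it (carrier_model, struct resource,
enum carrier_dir, the resource table) is hypothetical, and the return
values and error codes are my assumptions, since the helper's final
signature is not settled.

```c
/* User-space model only; every name here is hypothetical. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Assumed direction flag: IN copies kernel -> prog, OUT copies prog -> kernel. */
enum carrier_dir { CARRIER_IN, CARRIER_OUT };

struct resource {
	const char *name;
	unsigned char data[64];
	size_t len;
};

/* Stand-in for the named resources the kernel would keep for a kimage. */
static struct resource resources[] = {
	{ .name = "kernel", .data = "compressed-image", .len = 16 },
	{ .name = "initrd", .len = 0 },
};

static struct resource *find_resource(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(resources) / sizeof(resources[0]); i++)
		if (strcmp(resources[i].name, name) == 0)
			return &resources[i];
	return NULL;
}

/*
 * Model of bpf_helper_carrier(resource_name, buf, size, dir).
 * Returns the number of bytes copied, or a negative errno (assumed).
 */
static long carrier_model(const char *name, void *buf, size_t size,
			  enum carrier_dir dir)
{
	struct resource *res;

	assert(name && buf);

	res = find_resource(name);
	if (!res)
		return -ENOENT;

	if (dir == CARRIER_IN) {
		/* The caller's buffer must hold the whole resource. */
		if (size < res->len)
			return -ENOSPC;
		memcpy(buf, res->data, res->len);
		return (long)res->len;
	}

	/* CARRIER_OUT: the result must fit in the resource's storage. */
	if (size > sizeof(res->data))
		return -E2BIG;
	memcpy(res->data, buf, size);
	res->len = size;
	return (long)size;
}
```

In this model the parser would fetch "kernel" with CARRIER_IN, work on its
private copy, and hand the result back with CARRIER_OUT, so the data never
travels through user space.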
Extra bpf-helpers:

	/* Decompress the compressed kernel image */
	bpf_helper_decompress(src, src_size, dst, dst_sz)

	/*
	 * Verify the signature of @addon_filename, then add it to the
	 * initrd's dir @dst_dir
	 */
	bpf_helper_supplement_initrd(dst_dir, addon_filename)

Note: Because a UEFI environment (such as edk2) only provides basic file
operations for FAT filesystems, any UEFI-stub PE image (like systemd-stub)
is restricted to these basic operation services. As a result, the
functionality required of the bpf-kexec helpers is inherently limited.

*** Thoughts about the basic operation ***

The basic operations determine the stability of the bpf-kexec-helpers.

kexec_file_load deals with three kinds of elements: the linux kernel, the
initrd and the cmdline.

For the kernel, on arm64 or riscv, a bpf-helper function that wraps
__decompress() is needed to get the bootable image out of the compressed
data.

For the initrd, systemd-sysext may require adding extra files to the
initrd.

For the cmdline, some string trimming or concatenation may be required.

Overall, these user requirements are foreseeable and straightforward,
which suggests that the bpf-kexec-helpers will likely remain stable
without significant changes.

Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
Cc: John Fastabend <john.fastabend@xxxxxxxxx>
Cc: Jeremy Linton <jeremy.linton@xxxxxxx>
Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
Cc: Will Deacon <will@xxxxxxxxxx>
Cc: Mark Rutland <mark.rutland@xxxxxxx>
Cc: Simon Horman <horms@xxxxxxxxxx>
Cc: Gerd Hoffmann <kraxel@xxxxxxxxxx>
Cc: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
Cc: Philipp Rudo <prudo@xxxxxxxxxx>
Cc: Jan Hendrik Farr <kernel@xxxxxxxx>
Cc: Baoquan He <bhe@xxxxxxxxxx>
Cc: Dave Young <dyoung@xxxxxxxxxx>
Cc: Eric Biederman <ebiederm@xxxxxxxxxxxx>
Cc: Pingfan Liu <piliu@xxxxxxxxxx>
To: kexec@xxxxxxxxxxxxxxxxxxx
To: bpf@xxxxxxxxxxxxxxx