Re: [RFC] kexec: Use bpf to allow kexec to load PE format boot image

Hi Pingfan,

thanks for sharing your thoughts. Please see my comments below.

On Tue, 14 Jan 2025 09:28:25 +0800
Pingfan Liu <piliu@xxxxxxxxxx> wrote:

> Nowadays UEFI PE bootable images are increasingly popular in distributions.
> But loading that kind of image with kexec while IMA is enabled is still an open issue.
> 
> *** A brief review of the history ***
> There are two categories of methods to handle this issue.
>   -1. UEFI service emulator for UEFI stub
>   -2. PE format parser
> 
> For the first one, I have tried a purgatory-style emulator [1]. But it
> runs into hardware scaling problems.  For the second one, there are two
> choices, one is to implement it inside the kernel, the other is inside the user
> space.  Both zboot-format [2] and UKI-format [3] parsers were rejected due to
> the concern that the variant format parsers will inflate the kernel code.  And
> finally, we have these kinds of parsers in the user space 'kexec-tools'.
> 
> 
> From the beginning, it has been perceived that the user space parser cannot
> satisfy the requirements of secure boot without an extra embedded signature.
> This issue was suspended at that time.
> 
> But now, more and more users expect the security feature and want
> kexec_file_load to guarantee it by IMA.  I tried to fix that issue by the extra
> embedded signature method. But it is also disliked.
> 
> Inspired by Philipp's suggestion about implementing systemd-stub in bpf opcode in the discussion of [1],
> I turned to bpf and hope that parsers in a bpf-program can resolve this issue.
> 
> [1]: https://lore.kernel.org/lkml/20240819145417.23367-1-piliu@xxxxxxxxxx/T/
> [2]: https://lore.kernel.org/kexec/20230306030305.15595-1-kernelfans@xxxxxxxxx/
> [3]: https://lore.kernel.org/lkml/20230911052535.335770-1-kernel@xxxxxxxx/
> [4]: https://lore.kernel.org/linux-arm-kernel/20230921133703.39042-2-kernelfans@xxxxxxxxx/T/
> 
> 
> 
> 
> *** Reflect the problem and a new proposal ***
> 
> The UEFI emulator is anchored at the UEFI spec. That will incur lots of work
> due to various hardware support.  For example, to support TPM, the emulator
> should implement PCI/I2C bus protocol.
> 
> But if the problem is confined to the original linux kernel boot protocol, it will be simple.
> Only three things should be considered: the kernel image, the initrd and the command line.
> If we can get them in a secure way, we can tackle the problem.
> 
> The integrity of the file is ensured under the protection of the signature
> envelope.  If the kexeced files are parsed in the user space, the envelopes are
> opened and become invalid.  So they should sink into the kernel space, be verified and
> be manipulated there.  And to manipulate the various file formats, we need
> a bpf-program, which knows their format.
> 
> There are three parties in this solution
> -1. The kexec-tools itself is protected by IMA, and it creates a bpf-map and
> updates UKI addon file names into the map. Later, the bpf-program will call
> a bpf-helper to pad these files into initrd
> 
> -2. The bpf-program is contained in a dedicated '.bpf' section in the PE file. When
> kexec_file_load loads a PE image, the kernel extracts the '.bpf' section and exposes it to the
> user space through procfs. And kexec-tools starts the program.  In this way,
> the bpf-program itself is free from tampering.
> 
> The bpf-program completes two things:
> 	-1.parse the image format
> 	-2.call bpf kexec helpers to manipulate signed files
> 
> -3. The bpf helpers. There will be three helpers introduced.
> The first one for the data exchange between the bpf-program and the kernel.
> The second one for the decompressor.
> The third one for the manipulation of the cpio

I find this design very complicated. In particular, I don't like that the
bpf program is exported back to user space to be loaded separately.
This not only requires us to protect kexec-tools by IMA but also
all the tools and libraries that are involved in running kexec-tools
(libc, ld, etc.). But even that will probably not be enough when you
look at all the different ways user space programs can interact with
each other and change each other's behavior (see the xz-backdoor for
example). So we would probably need to protect all of user space
if we want to use this design in a secure boot environment, which is
out of scope for the feature.

Alternatively, we would need to verify in the kernel that the bpf
program loaded by the kexec-tools is identical to the one included in
the kernel image. But then what's the point in exporting it in the
first place? Especially as already today there is the
kernel/bpf/syscall.c:kern_sys_bpf function that allows running the
bpf syscall from within the kernel (with some limitations, but it
allows loading a program).

All in all I think it is better to keep the current design, i.e.
kexec-tools only makes one system call and the rest is done in the
kernel.
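
For reference, the single-call flow from the kexec-tools side can be
sketched roughly as below. The wrapper name load_image_single_syscall
is made up for illustration. kexec_file_load itself and its argument
order (kernel fd, initrd fd, cmdline length including the trailing NUL,
cmdline, flags) are the existing interface. Whether a bpf-aware variant
would keep exactly this shape is an assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical wrapper (illustrative name): hand the kernel image,
 * the initrd and the command line to the kernel in a single
 * kexec_file_load call. Parsing and verification then happen in the
 * kernel. Returns 0 on success, -1 with errno set on failure.
 */
int load_image_single_syscall(int kernel_fd, int initrd_fd,
			      const char *cmdline, unsigned long flags)
{
#ifdef SYS_kexec_file_load
	/* cmdline_len must count the terminating NUL byte. */
	unsigned long cmdline_len = strlen(cmdline) + 1;
	long ret = syscall(SYS_kexec_file_load, kernel_fd, initrd_fd,
			   cmdline_len, cmdline, flags);

	return ret == 0 ? 0 : -1;
#else
	/* The libc headers on this platform do not expose the syscall. */
	(void)kernel_fd; (void)initrd_fd; (void)cmdline; (void)flags;
	errno = ENOSYS;
	return -1;
#endif
}
```

The caller needs CAP_SYS_BOOT for this to succeed. Without it, or with
invalid file descriptors, the call fails and errno says why.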

In addition, while I agree that ideally we include the new feature
into kexec_file_load, I think it's better to define a new system call
for images containing bpf in the beginning. With that we have a blank
slate we can mess with without the need to take care of keeping the old
code working. Plus, it leaves us a fallback to load a dump kernel when
we mess up. Once we have a working prototype and a better understanding
of what is needed we can still merge it back into kexec_file_load.

> ***  Overview of the design in Pseudocode ***
> 
> 
> ThreadA: kexec thread which invokes kexec_file_load
> ThreadB: the dedicated thread in kexec-tools to load bpf-prog
> ------
> Diag 1. the interaction between bpf-prog loader and its executor
> 
> 
> ThreadA						ThreadB
> 
> 						wait on eventfd_A
> 
> 
> expose bpf-prog through procfs
> & signal eventfd_A
> & wait on eventfd_B
> 
> 						read the bpf-prog from procfs
> 						& initialize the bpf and install it to the fentry
> 						& signal eventfd_B
> 						& wait on eventfd_A again
> 						
> fentry executes bpf-prog to parse image
> & generate output for the next stop
> 
> 
> -------------------
> Diag 2. bpf-prog
> 
> SEC("fentry/kexec_pe_parser_hook")
> int BPF_PROG(pe_parser, struct kimage *image, ...)
> {
> 
> 	buf = bpf_ringbuf_reserve(rb, size);
> 	buf_result = bpf_ringbuf_reserve(rb, res_sz);
> 	/* Ask kernel to copy the resource content to here */
> 	bpf_helper_carrier(resource_name, buf, size, in);
> 	
> 	/* Parse the format laying on buf */
> 	...
> 	/* call extra bpf-helpers */
> 	...
> 	
> 	/* Ask kernel to copy the resource content from here */
> 	bpf_helper_carrier(resource_name, buf_result, res_sz, out);
> 
> }
> 
> At present, bpf map functions provide the mechanism to exchange data between the user space and bpf-prog.
> But for bpf-prog and the kernel, there is no good choice. So I introduce a bpf helper function
> 	bpf_helper_carrier(resource_name, buf, size, in)
> 
> The above code implements the data exchange between the kernel and bpf-prog.
> In this way, the data parsing process is no longer exposed to the user space.
> 
> 
> 
> extra bpf-helpers:
> 
> 	/* Decompress the compressed kernel image */
> 	bpf_helper_decompress(src, src_size, dst, dst_sz)
> 	
> 	/* 
> 	 * Verify the signature of @addon_filename, padding it to initrd's dir @dst_dir
> 	 */
> 	bpf_helper_supplement_initrd(dst_dir, addon_filename)

UKI addons can also append entries to the kernel command line. IMHO it
will be easiest if we maintain the initrd and command line in the
kernel, i.e. the syscall "prepopulates" the initrd and cmdline either
from the UKI or what kexec-tools provides. The bpf program then only
updates them. That's not ideal but it keeps the bpf program simple in
the beginning so we (hopefully) don't run into the limitations bpf
programs have. Once we have a working prototype we can still move
functionality over to the bpf program.

The way I see it this should work with three helper functions.
One to read+verify a file and one each to append data to the initrd or
command line.
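
To make the split concrete, here is a minimal user-space model of those
three helpers. It is only a sketch under assumptions: struct kexec_ctx,
verify_stub, read_verified, append_initrd and append_cmdline are
hypothetical names, the "verification" is a placeholder that accepts any
non-empty blob, and real helpers would live in the kernel and operate on
the kimage rather than on heap buffers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of the state the syscall would prepopulate. */
struct kexec_ctx {
	unsigned char *initrd;	/* concatenated initrd segments */
	size_t initrd_len;
	char *cmdline;		/* NUL-terminated, or NULL if empty */
};

/* Placeholder for signature verification: accepts any non-empty blob. */
int verify_stub(const void *blob, size_t len)
{
	return (blob && len > 0) ? 0 : -1;
}

/* Helper 1 (model): hand out a blob only if it passes verification. */
const void *read_verified(const void *blob, size_t len)
{
	return verify_stub(blob, len) == 0 ? blob : NULL;
}

/* Helper 2 (model): append a verified blob to the initrd. */
int append_initrd(struct kexec_ctx *ctx, const void *blob, size_t len)
{
	unsigned char *p;

	if (!read_verified(blob, len))
		return -1;
	p = realloc(ctx->initrd, ctx->initrd_len + len);
	if (!p)
		return -1;
	memcpy(p + ctx->initrd_len, blob, len);
	ctx->initrd = p;
	ctx->initrd_len += len;
	return 0;
}

/* Helper 3 (model): append to the command line, separated by a space. */
int append_cmdline(struct kexec_ctx *ctx, const char *extra)
{
	size_t old = ctx->cmdline ? strlen(ctx->cmdline) : 0;
	size_t add = strlen(extra);
	char *p;

	if (add == 0)
		return 0;
	/* old text + optional separating space + new text + NUL */
	p = realloc(ctx->cmdline, old + (old ? 1 : 0) + add + 1);
	if (!p)
		return -1;
	if (old)
		p[old++] = ' ';
	memcpy(p + old, extra, add + 1);
	ctx->cmdline = p;
	return 0;
}
```

With this split the bpf program never owns the initrd or the command
line. It only asks for verified data and tells the kernel where to
append it, which keeps the program itself small.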

> 	Note: Due to the UEFI environment (such as edk2) only providing basic
>         file operations for FAT filesystems, any UEFI-stub PE image (like systemd-stub)
>         is restricted to these basic operation services.  As a result, the
>         functionality of such bpf-kexec helpers is inherently limited.

Is this limitation really necessary? The way I see it this is a
limitation to keep the UEFI environment simple. But when we run kexec
the kernel is fully booted. So we can make use of all the file systems
included in the kernel.

Thanks
Philipp

> *** Thoughts about the basic operation *** 
> 
> The basic operations have influence on the stability of bpf-kexec-helpers.
> 
> The kexec_file_load faces three kinds of elements: linux-kernel, initrd and cmdline.
> 
> For the kernel, on arm64 or riscv, in order to get the bootable image from the compressed data,
> there should be a bpf-helper function as a wrapper of __decompress()
> 
> For initrd, systemd-sysext may require padding extra files into initrd
> 
> For cmdline, it may require some string trimming or concatenation.
> 
> Overall, these user requirements are foreseeable and straightforward,
> suggesting that bpf-kexec-helpers will likely remain stable without significant
> changes.
> 
> 
> Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
> Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> Cc: John Fastabend <john.fastabend@xxxxxxxxx>
> Cc: Jeremy Linton <jeremy.linton@xxxxxxx>
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> Cc: Mark Rutland <mark.rutland@xxxxxxx>
> Cc: Simon Horman <horms@xxxxxxxxxx>
> Cc: Gerd Hoffmann <kraxel@xxxxxxxxxx>
> Cc: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> Cc: Philipp Rudo <prudo@xxxxxxxxxx>
> Cc: Jan Hendrik Farr <kernel@xxxxxxxx>
> Cc: Baoquan He <bhe@xxxxxxxxxx>
> Cc: Dave Young <dyoung@xxxxxxxxxx>
> Cc: Eric Biederman <ebiederm@xxxxxxxxxxxx>
> Cc: Pingfan Liu <piliu@xxxxxxxxxx>
> To: kexec@xxxxxxxxxxxxxxxxxxx
> To: bpf@xxxxxxxxxxxxxxx
> 




