On Sat, 20 Jul 2024 at 17:40, Shao, Marshall <Marshall.Shao@xxxxxxxx> wrote: > > Hi Ard, > > Thanks for your reply. Here is the problem I encountered, and the problem > still exists in the mainline: > > Firmware: UEFI 2.4, > EFI loader: systemd-boot > > When I update my kernel from LTS v6.1.x to v6.6.x, the kernel fails to > boot when it jumps from the handover protocol. And it fails only on an > old Firmware that MemoryOverwriteRequestControl(MOR) bit is set to 1. > > According to the TCG Reset Attack Mitigation Spec, a Memory Clear Method > will be loaded in the memory on boot, and this causes the EFI loader > and OS memory address to shift. Another important factor is that my > BSS region and boot_idt entries are not as clean as those in QEMU and > other platforms. > > This means that clearing BSS in the handover function can cause the > firmware's IDT/GDT corruption, as you mentioned here commit <7b8439b0369b> This commit ID does not exist, I take it you mean ebf5a79acf9a2970e93d30a9e97b08913ef15711 > (x86/efistub: Drop redundant clearing of BSS). systemd-boot does not use the handover entrypoint, it uses the native PE entrypoint, which no longer clears BSS with the commit above applied. > I noticed that most of > variables in BSS will be initialized before accessing, and only a few are > not, that's why I removed memset part in handover entry point. > It would be better not to rely on special semantics that deviate from standard C. That said, it would also be better not to rely on this funky EFI handover protocol in the first place, which is only implemented by downstream, distro versions of GRUB. Given that GRUB now supports the native EFI entrypoint properly, the handover protocol is essentially deprecated. This does not mean, of course, that we should stop supporting it. But removing the memset() there may break things in a way that only becomes apparent once the changes trickle down to systems running those older GRUBs. > In terms of the BOOT_STACK_SIZE, I found out that the boot_stack region is > used in this case (at the beginning of _bss), and 0x4000 is not sufficient > to cover this issue since the data in boot_heap will be overwritten during > decompression. EFI boot uses the EFI firmware stack, and never enters the decompressor, where the address of boot_stack_end is loaded into ESP/RSP. This means that increasing the boot stack size does not prevent an overrun, but is affecting the image layout in a different way, avoiding your boot issue. It would be nice to get to the bottom of this: can you double check that the PE image metadata is accurate? llvm-readelf -a can be used to dump the PE header of a bzImage. > Prior to image decompression in EFI stub, it was ok. > > 00000000008fc000 g .bss 0000000000000000 .hidden _bss > 00000000008fc000 l O .bss 0000000000004000 boot_stack > 0000000000900000 g O .bss 0000000000000004 .hidden spurious_nmi_count > 0000000000900000 l O .bss 0000000000000000 boot_stack_end > 0000000000900010 g O .bss 0000000000000018 .hidden pio_ops > 0000000000900028 g O .bss 0000000000000008 .hidden boot_params_ptr > 0000000000900040 l O .bss 0000000000001000 scratch.0 > 0000000000901040 l O .bss 0000000000010000 boot_heap > > My thought was that increasing the size of boot_stack should be harmless. I'm not bothered by the change itself, I just want to make sure that the fix is guaranteed to address the problem rather than paper over it. > As for the memset part, there could be an alternative way, althought it > looks a bit ugly. > > memset(_bss+0x10000, 0, _ebss - _bss - 0x10000) > So now you are applying the memset only to part of BSS, right? How does this help? > However, I've updated the diff, if you think this is more like a > workaround, please take this thread as a bug report. Thanks! > > Regards, > > Marshall Shao > > > Signed-off-by: Marshall Shao <marshall.shao@xxxxxxxx> > --- > arch/x86/boot/compressed/misc.c | 4 ++-- > arch/x86/include/asm/boot.h | 2 +- > drivers/firmware/efi/libstub/x86-5lvl.c | 2 +- > drivers/firmware/efi/libstub/x86-stub.c | 2 -- > include/linux/decompress/mm.h | 2 +- > 5 files changed, 5 insertions(+), 7 deletions(-) > > diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c > index 944454306ef4..49b68f57dd18 100644 > --- a/arch/x86/boot/compressed/misc.c > +++ b/arch/x86/boot/compressed/misc.c > @@ -50,8 +50,8 @@ struct boot_params *boot_params_ptr; > > struct port_io_ops pio_ops; > > -memptr free_mem_ptr; > -memptr free_mem_end_ptr; > +memptr free_mem_ptr __section(".data"); > +memptr free_mem_end_ptr __section(".data"); > int spurious_nmi_count; > > static char *vidmem; > diff --git a/arch/x86/include/asm/boot.h b/arch/x86/include/asm/boot.h > index 3e5b111e619d..312bc87ab027 100644 > --- a/arch/x86/include/asm/boot.h > +++ b/arch/x86/include/asm/boot.h > @@ -33,7 +33,7 @@ > #endif > > #ifdef CONFIG_X86_64 > -# define BOOT_STACK_SIZE 0x4000 > +# define BOOT_STACK_SIZE 0x10000 > > /* > * Used by decompressor's startup_32() to allocate page tables for identity > diff --git a/drivers/firmware/efi/libstub/x86-5lvl.c b/drivers/firmware/efi/libstub/x86-5lvl.c > index 77359e802181..bebae4fdfb93 100644 > --- a/drivers/firmware/efi/libstub/x86-5lvl.c > +++ b/drivers/firmware/efi/libstub/x86-5lvl.c > @@ -10,7 +10,7 @@ > > bool efi_no5lvl; > > -static void (*la57_toggle)(void *cr3); > +static void (*la57_toggle)(void *cr3) __section(".data"); > > static const struct desc_struct gdt[] = { > [GDT_ENTRY_KERNEL32_CS] = GDT_ENTRY_INIT(DESC_CODE32, 0, 0xfffff), > diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c > index 078055b054e3..ffc62af50669 100644 > --- a/drivers/firmware/efi/libstub/x86-stub.c > +++ b/drivers/firmware/efi/libstub/x86-stub.c > @@ -21,7 +21,6 @@ > #include "efistub.h" > #include "x86-stub.h" > > -extern char _bss[], _ebss[]; > > const efi_system_table_t *efi_system_table; > const efi_dxe_services_table_t *efi_dxe_table; > @@ -1059,7 +1058,6 @@ void __noreturn efi_stub_entry(efi_handle_t handle, > void efi_handover_entry(efi_handle_t handle, efi_system_table_t *sys_table_arg, > struct boot_params *boot_params) > { > - memset(_bss, 0, _ebss - _bss); > efi_stub_entry(handle, sys_table_arg, boot_params); > } > > diff --git a/include/linux/decompress/mm.h b/include/linux/decompress/mm.h > index ac862422df15..62c04691c898 100644 > --- a/include/linux/decompress/mm.h > +++ b/include/linux/decompress/mm.h > @@ -36,7 +36,7 @@ > /* A trivial malloc implementation, adapted from > * malloc by Hannu Savolainen 1993 and Matthias Urlichs 1994 > */ > -STATIC_RW_DATA unsigned long malloc_ptr; > +STATIC_RW_DATA unsigned long malloc_ptr __section(".data"); > STATIC_RW_DATA int malloc_count; > > MALLOC_VISIBLE void *malloc(int size) > -- > 2.34.1 >