Add kexec list in cc

On Sat, 9 Sept 2023 at 19:34, Kirill A. Shutemov
<kirill.shutemov@xxxxxxxxxxxxxxx> wrote:
>
> On Fri, Sep 08, 2023 at 06:17:53PM +0200, Ard Biesheuvel wrote:
> > On Fri, Sep 8, 2023 at 5:58 PM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> > >
> > > On Fri, Sep 08, 2023 at 03:32:33PM +0300, Kirill A. Shutemov wrote:
> > > > On Fri, Sep 08, 2023 at 02:02:30PM +0800, Aaron Lu wrote:
> > > > > On Thu, Sep 07, 2023 at 04:14:09PM +0300, Kirill A. Shutemov wrote:
> > > > > > On Tue, Aug 29, 2023 at 10:04:51PM +0800, Aaron Lu wrote:
> > > > > > > > Could you show dmesg of the first kernel before kexec?
> > > > > > >
> > > > > > > Attached.
> > > > > > >
> > > > > > > BTW, kexec is invoked like this:
> > > > > > > kver=6.4.0-rc5-00009-g75d090fd167a
> > > > > > > kdir=$HOME/kernels/$kver
> > > > > > > sudo kexec -l $kdir/vmlinuz-$kver --initrd=$kdir/initramfs-$kver.img --append="root=UUID=4381321e-e01e-455a-9d46-5e8c4c5b2d02 ro net.ifnames=0 acpi_rsdp=0x728e8014 no_hash_pointers sched_verbose selinux=0"
> > > > > >
> > > > > > I don't understand why it happens.
> > > > > >
> > > > > > Could you check if this patch changes anything:
> > > > > >
> > > > > > diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> > > > > > index 94b7abcf624b..172c476ff6f3 100644
> > > > > > --- a/arch/x86/boot/compressed/misc.c
> > > > > > +++ b/arch/x86/boot/compressed/misc.c
> > > > > > @@ -456,10 +456,12 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
> > > > > >
> > > > > >         debug_putstr("\nDecompressing Linux... ");
> > > > > >
> > > > > > +#if 0
> > > > > >         if (init_unaccepted_memory()) {
> > > > > >                 debug_putstr("Accepting memory... ");
> > > > > >                 accept_memory(__pa(output), __pa(output) + needed_size);
> > > > > >         }
> > > > > > +#endif
> > > > > >
> > > > > >         __decompress(input_data, input_len, NULL, NULL, output, output_len,
> > > > > >                         NULL, error);
> > > > > > --
> > > > >
> > > > > It solved the problem.
> > > >
> > > > Looks like increasing BOOT_INIT_PGT_SIZE fixes the issue. I don't yet
> > > > understand why and how unaccepted memory is involved. I will look more
> > > > into it.
> > > >
> > > > Enabling CONFIG_RANDOMIZE_BASE also makes the issue go away.
> > >
> > > Is this perhaps just luck? I.e. does it ever break on, say, 1000 boot
> > > attempts? (i.e. maybe some position is bad and KASLR happens to usually
> > > avoid it?)
>
> Yes, it can be luck.
>
> > > > Kees, maybe you have a clue?
> > >
> > > The only thing I can think of is that something isn't being counted
> > > correctly due to the size of code, and it just happens that this commit
> > > makes the code large enough to exceed some set of mappings?
> > >
> > > >
> > > > diff --git a/arch/x86/include/asm/boot.h b/arch/x86/include/asm/boot.h
> > > > index 9191280d9ea3..26ccce41d781 100644
> > > > --- a/arch/x86/include/asm/boot.h
> > > > +++ b/arch/x86/include/asm/boot.h
> > > > @@ -40,7 +40,7 @@
> > > >  #ifdef CONFIG_X86_64
> > > >  # define BOOT_STACK_SIZE 0x4000
> > > >
> > > > -# define BOOT_INIT_PGT_SIZE (6*4096)
> > > > +# define BOOT_INIT_PGT_SIZE (7*4096)
> > >
> > > That's why this might be working, for example? How large is the boot
> > > image before/after the commit, etc?
> > >
> >
> > Not sure why these changes would make a difference here, but choking
> > on accept_memory() on a non-TDX system suggests that init_unaccepted_memory()
> > is poking into unmapped memory before it even decides that the
> > unaccepted memory does not exist.
> >
> > init_unaccepted_memory() has
> >
> >         ret = efi_get_conf_table(boot_params, &cfg_table_pa, &cfg_table_len);
> >         if (ret) {
> >                 warn("EFI config table not found.");
> >                 return false;
> >         }
> >
> > which looks for <guid, phys_addr> tuples in an array pointed to by the
> > EFI system table, and if either of those is not mapped, things can be
> > expected to explode.
> >
> > The only odd thing there is that this code is invoked after setting up
> > the 'demand paging' logic in the decompressor.
> >
> > If you haven't yet, could you please retry the kexec boot with
> > earlyprintk=tty<insert your UART params here>?
>
> early console in extract_kernel
> input_data: 0x000000807eb433a8
> input_len: 0x0000000000d26271
> output: 0x000000807b000000
> output_len: 0x0000000004800c10
> kernel_total_size: 0x0000000003e28000
> needed_size: 0x0000000004a00000
> trampoline_32bit: 0x000000000009d000
>
> Decompressing Linux... out of pgt_buf in arch/x86/boot/compressed/ident_map_64.c!?
> pages->pgt_buf_offset: 0x0000000000006000
> pages->pgt_buf_size: 0x0000000000006000
>
>
> Error: kernel_ident_mapping_init() failed
>
> It crashes on #PF due to stbl->nr_tables dereference in
> efi_get_conf_table() called from init_unaccepted_memory().
>
> I don't see anything special about stbl location: 0x775d6018.
>
> One other bit of information: disabling 5-level paging also helps the
> issue.
>
> I will debug further.
>
> --
> Kiryl Shutsemau / Kirill A. Shutemov
>
_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec
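
For anyone reading the pgt_buf numbers in the log above, here is a minimal userspace sketch of what they seem to say. It assumes the decompressor hands out page-table pages from a fixed pool in page-sized steps and refuses once the offset has reached the pool size. The names pgt_pool, pool_alloc_page, pool_capacity and POOL_MAX_PAGES are hypothetical and are not taken from the kernel source. Under that assumption, pgt_buf_offset 0x6000 with pgt_buf_size 0x6000 means all six pages of BOOT_INIT_PGT_SIZE (6*4096) had already been handed out when one more table was requested, and a seventh page would leave room for exactly one more.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE	4096UL
#define POOL_MAX_PAGES	16UL	/* arbitrary upper bound for this sketch */

/* Hypothetical stand-in for the decompressor's page-table pool. */
struct pgt_pool {
	unsigned char *buf;
	unsigned long offset;	/* bytes handed out so far */
	unsigned long size;	/* total bytes in the pool */
};

/* Hand out one page, or NULL once offset has reached size. */
static void *pool_alloc_page(struct pgt_pool *pool)
{
	void *page;

	if (pool->offset >= pool->size)
		return NULL;	/* the "out of pgt_buf" case */

	page = pool->buf + pool->offset;
	pool->offset += PAGE_SIZE;
	return page;
}

/* Count how many pages a pool of nr_pages supplies before running dry. */
unsigned long pool_capacity(unsigned long nr_pages)
{
	static unsigned char backing[POOL_MAX_PAGES * PAGE_SIZE];
	struct pgt_pool pool = { backing, 0, nr_pages * PAGE_SIZE };
	unsigned long n = 0;

	assert(nr_pages <= POOL_MAX_PAGES);

	while (pool_alloc_page(&pool))
		n++;
	return n;
}
```

Calling pool_capacity(6) models the current BOOT_INIT_PGT_SIZE and pool_capacity(7) models the bumped value. This says nothing about why the extra table is needed in the kexec case, only that the pool has no slack at six pages.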