On 9/11/23 10:53, Kirill A. Shutemov wrote:
On Mon, Sep 11, 2023 at 10:33:01AM -0500, Tom Lendacky wrote:
On 9/11/23 09:57, Kirill A. Shutemov wrote:
On Mon, Sep 11, 2023 at 10:56:36PM +0800, Dave Young wrote:
early console in extract_kernel
input_data: 0x000000807eb433a8
input_len: 0x0000000000d26271
output: 0x000000807b000000
output_len: 0x0000000004800c10
kernel_total_size: 0x0000000003e28000
needed_size: 0x0000000004a00000
trampoline_32bit: 0x000000000009d000
Decompressing Linux... out of pgt_buf in arch/x86/boot/compressed/ident_map_64.c!?
pages->pgt_buf_offset: 0x0000000000006000
pages->pgt_buf_size: 0x0000000000006000
Error: kernel_ident_mapping_init() failed
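For anyone reading the log above: pgt_buf_offset equal to pgt_buf_size
(0x6000, i.e. six 4 KiB pages) is what an exhausted fixed-size page-table
pool looks like. Below is a minimal userspace sketch of such a bump
allocator. The names pgt_pool and pool_alloc_page are hypothetical; this is
not the code in ident_map_64.c, only a model of the kind of check that
would print that message.

```c
#include <stddef.h>

#define PAGE_SIZE 4096UL

/* Hypothetical model of a fixed-size page-table buffer. */
struct pgt_pool {
	unsigned char *buf;	/* backing storage for page tables */
	size_t size;		/* total bytes available */
	size_t offset;		/* bytes already handed out */
};

/*
 * Hand out the next page from the pool, or NULL once offset has
 * reached size (the "out of pgt_buf" condition).
 */
static void *pool_alloc_page(struct pgt_pool *p)
{
	void *page;

	if (p->offset >= p->size)
		return NULL;

	page = p->buf + p->offset;
	p->offset += PAGE_SIZE;
	return page;
}
```

With a 6-page buffer, the seventh request fails with offset == size ==
0x6000, matching the two values printed in the log.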
It crashes on #PF due to stbl->nr_tables dereference in
efi_get_conf_table() called from init_unaccepted_memory().
I don't see anything special about stbl location: 0x775d6018.
One other bit of information: disabling 5-level paging also helps the
issue.
I will debug further.
The problem is not limited to unaccepted memory; it also triggers if we
reach efi_get_rsdp_addr() in the same setup.
I think we have several problems here.
- 6 pages for !RANDOMIZE_BASE are only enough for kernel, cmdline,
boot_data and setup_data if we assume that they are in different 1G
regions and do not cross 1G boundaries. With 4-level paging: 1 for PGD, 1
for PUD, 4 for PMD tables.
Looks like we never map EFI/ACPI memory explicitly.
It might work if kernel/cmdline/... are in a single 1G region and we have
spare pages to handle page faults.
- No spare memory to handle mapping for cc_info and cc_info->cpuid_phys;
- I didn't increase BOOT_INIT_PGT_SIZE when I added 5-level paging support.
And if we start page tables from scratch ('else' case of 'if (p4d_offset...))
we run out of memory.
I believe similar logic would apply for BOOT_PGT_SIZE for RANDOMIZE_BASE=y
case.
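The page counts in the list above can be checked with a rough model. This
sketch assumes identity mapping with 2M pages, so each touched 1G region
costs one PMD table, each 512G region one PUD table and, with 5-level
paging, each 256T region one P4D table, plus one top-level page. The struct
and function names are made up for illustration, and the result is only an
estimate under these assumptions, not what the boot code computes.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_BLOCKS 64	/* illustrative cap; extra blocks are not counted */

/* Hypothetical description of one region to be identity-mapped. */
struct region {
	uint64_t start;
	uint64_t len;
};

/* Count distinct aligned blocks of size (1 << shift) touched by any region. */
static size_t count_blocks(const struct region *r, size_t n, unsigned int shift)
{
	uint64_t seen[MAX_BLOCKS];
	size_t nseen = 0;

	for (size_t i = 0; i < n; i++) {
		if (r[i].len == 0)
			continue;

		uint64_t first = r[i].start >> shift;
		uint64_t last = (r[i].start + r[i].len - 1) >> shift;

		for (uint64_t b = first; b <= last; b++) {
			size_t j;

			for (j = 0; j < nseen; j++)
				if (seen[j] == b)
					break;
			if (j == nseen && nseen < MAX_BLOCKS)
				seen[nseen++] = b;
		}
	}
	return nseen;
}

/* Estimate page-table pages needed; levels is 4 or 5. */
static size_t pgt_pages_needed(const struct region *r, size_t n, int levels)
{
	size_t pages = 1;			/* top-level table (PGD) */

	if (levels == 5)
		pages += count_blocks(r, n, 48);	/* P4D tables, one per 256T */
	pages += count_blocks(r, n, 39);	/* PUD tables, one per 512G */
	pages += count_blocks(r, n, 30);	/* PMD tables, one per 1G */
	return pages;
}
```

For four regions that sit in four different 1G regions below 512G, this
gives 1 + 1 + 4 = 6 pages with 4-level paging and 7 with 5-level paging. A
region that straddles a 1G boundary costs an additional PMD table.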
I don't know what the right fix is here. We can increase the constants to be
enough to cover existing cases, but it is very fragile. I am not sure I
saw all users. Some of them could be silently handled by the page fault
handler in some setups. And it is hard to catch new users during code review.
Also, I'm not sure why we need the page fault handler there. It looks like
it is just masking problems. I think everything has to be mapped explicitly.
Any comments?
There was a related issue around the cc_info blob that is captured
here: https://lore.kernel.org/lkml/20230601072043.24439-1-ltao@xxxxxxxxxx/
Personally, I'm a fan of mapping the EFI tables that will be passed to the
kexec/kdump kernel. To me, that seems to more closely match the valid
mappings for the tables when control is transferred to the OS from UEFI on
the initial boot.
I don't see how it would help if initialize_identity_maps() resets
pagetables. See 'else' case of 'if (p4d_offset...).
Ok, I see what you mean now.
Thanks,
Tom
_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec