On 13/09/2024 12:49, Dave Young wrote:
> On Fri, 13 Sept 2024 at 19:13, Dave Young <dyoung@xxxxxxxxxx> wrote:
>>
>> On Fri, 13 Sept 2024 at 19:07, Usama Arif <usamaarif642@xxxxxxxxx> wrote:
>>>
>>>
>>>
>>> On 13/09/2024 11:56, Dave Young wrote:
>>>> On Thu, 12 Sept 2024 at 22:15, Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>>>>>
>>>>> (cc Dave)
>>>>
>>>> Thanks for ccing me.
>>>>
>>>>>
>>>>> Full thread here:
>>>>> https://lore.kernel.org/all/CAMj1kXG1hbiafKRyC5qM1Vj5X7x-dmLndqqo2AYnHMRxDz-80w@xxxxxxxxxxxxxx/T/#u
>>>>>
>>>>> On Thu, 12 Sept 2024 at 16:05, Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>>>>>>
>>>>>> On Thu, 12 Sept 2024 at 15:55, Usama Arif <usamaarif642@xxxxxxxxx> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 12/09/2024 14:10, Ard Biesheuvel wrote:
>>>>>>>> Does the below help at all?
>>>>>>>>
>>>>>>>> --- a/drivers/firmware/efi/tpm.c
>>>>>>>> +++ b/drivers/firmware/efi/tpm.c
>>>>>>>> @@ -60,7 +60,7 @@ int __init efi_tpm_eventlog_init(void)
>>>>>>>>         }
>>>>>>>>
>>>>>>>>         tbl_size = sizeof(*log_tbl) + log_tbl->size;
>>>>>>>> -       memblock_reserve(efi.tpm_log, tbl_size);
>>>>>>>> +       efi_mem_reserve(efi.tpm_log, tbl_size);
>>>>>>>>
>>>>>>>>         if (efi.tpm_final_log == EFI_INVALID_TABLE_ADDR) {
>>>>>>>>                 pr_info("TPM Final Events table not present\n");
>>>>>>>
>>>>>>> Unfortunately not. efi_mem_reserve updates e820_table, while kexec looks at /sys/firmware/memmap
>>>>>>> which is e820_table_firmware.
>>>>>>>
>>>>>>> arch_update_firmware_area introduced in the RFC patch does the same thing as efi_mem_reserve does at
>>>>>>> its end, just with e820_table_firmware instead of e820_table.
>>>>>>> i.e. efi_mem_reserve does:
>>>>>>> e820__range_update(addr, size, E820_TYPE_RAM, E820_TYPE_RESERVED);
>>>>>>> e820__update_table(e820_table);
>>>>>>>
>>>>>>> while arch_update_firmware_area does:
>>>>>>> e820__range_update_firmware(addr, size, E820_TYPE_RAM, E820_TYPE_RESERVED);
>>>>>>> e820__update_table(e820_table_firmware);
>>>>>>>
>>>>>>
>>>>>> Shame.
>>>>>>
>>>>>> Using efi_mem_reserve() is appropriate here in any case, but I guess
>>>>>> kexec on x86 needs to be fixed to juggle the EFI memory map, memblock
>>>>>> table, and 3 (!) versions of the E820 table in the correct way
>>>>>> (e820_table, e820_table_kexec and e820_table_firmware)
>>>>>>
>>>>>> Perhaps we can put this additional logic in x86's implementation of
>>>>>> efi_arch_mem_reserve()? AFAICT, all callers of efi_mem_reserve() deal
>>>>>> with configuration tables produced by the firmware that may not be
>>>>>> reserved correctly if kexec looks at e820_table_firmware[] only.
>>>>>
>>>>
>>>> I have not read all the conversations, let me have a look and response later.
>>>>
>>>> The first glance about the patch is that I think the kexec_file_load
>>>> syscall (default of latest kexec-tools) will not use
>>>> e820_table_firmware AFAIK. it will only use e820_table_kexec.
>>>
>>> I initially thought that as well. But it looks like kexec just reads /sys/firmware/memmap
>>>
>>> https://github.com/horms/kexec-tools/blob/main/kexec/firmware_memmap.h#L29
>>>
>>> which is e820_table_firmware.
>>
>> That piece of code is only used by kexec_load
>>
>>>
>>> The patch that Ard sent in https://lore.kernel.org/all/20240912155159.1951792-2-ardb+git@xxxxxxxxxx/
>>> is the right approach to it I believe, and I dont see the issue anymore after applying that patch.
>>>
>>>>
>>>> Usama, can you confirm how you tested this?
>>>> kexec -c -l will use kexec_load syscall
>>>
>>> I am currently testing in my VM setup with kexec_load. But production is running
>>> kexec_file_load and has the same issue.
>>
>> Ok, I mean efi_mem_reserve should be able to work if you retest with
>> kexec_file_load.
>
> Hold on, I'm not sure about above :(
>
> checking the efi_arch_mem_reserve(), currently it updates the e820
> table, if you update the e820_table_kexec and e820_table_firmware then
> I think both kexec_load and kexec_file_load will work.
>
> Anyway I was not aware very much about the firmware e820 tables and
> kexec tables when they were created. I suspect that a cleanup and
> revisit is needed. I will have a look at that.

Yes, I feel like there is one too many tables! From reading the code, I understand that
/sys/firmware/memmap should contain the untouched map at time of boot, i.e. e820_table_firmware.
But I would be in favour of getting rid of e820_table_firmware and just having e820_table_kexec,
with /sys/firmware/memmap getting its data from e820_table_kexec.

>
> For Ard's fix to allocate it as ACPI memory, I think it should be good
> and simpler.
>

Agreed!

>>
>>>
>>> Thanks,
>>> Usama
>>>
>>>> kexec [-s] -l will use kexec_file_load syscall
>>>>
>>>> Thanks
>>>> Dave
>>>>
>>>
>
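To make the table discussion above a bit more concrete, here is a rough userspace toy model of the problem. It is not kernel code: every toy_* name is made up for illustration and none of it is the real struct e820_table or e820__range_update(). It only sketches the idea that a reservation applied to one copy of the map is invisible to whoever reads another copy, and that applying the same update to all copies (as suggested for efi_arch_mem_reserve()) keeps them in agreement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins, NOT the kernel's struct e820_table / enum e820_type. */
enum toy_type { TOY_NONE = 0, TOY_RAM = 1, TOY_RESERVED = 2 };

struct toy_entry {
	uint64_t addr;
	uint64_t size;
	enum toy_type type;
};

#define TOY_MAX_ENTRIES 16

struct toy_table {
	size_t nr;
	struct toy_entry e[TOY_MAX_ENTRIES];
};

/* Append one entry; the toy table is unsorted and has a fixed capacity. */
void toy_add(struct toy_table *t, uint64_t addr, uint64_t size,
	     enum toy_type type)
{
	assert(t->nr < TOY_MAX_ENTRIES);
	t->e[t->nr].addr = addr;
	t->e[t->nr].size = size;
	t->e[t->nr].type = type;
	t->nr++;
}

/* Retype the part of [start, start + size) that currently has type 'from'. */
void toy_range_update(struct toy_table *t, uint64_t start, uint64_t size,
		      enum toy_type from, enum toy_type to)
{
	uint64_t end = start + size;
	size_t n = t->nr;	/* entries appended below need no second pass */

	for (size_t i = 0; i < n; i++) {
		uint64_t e_start = t->e[i].addr;
		uint64_t e_end = e_start + t->e[i].size;
		uint64_t lo = e_start > start ? e_start : start;
		uint64_t hi = e_end < end ? e_end : end;

		if (t->e[i].type != from || lo >= hi)
			continue;

		/* Shrink this entry to the overlap and retype it. */
		t->e[i].addr = lo;
		t->e[i].size = hi - lo;
		t->e[i].type = to;

		/* Keep whatever sticks out on either side as the old type. */
		if (e_start < lo)
			toy_add(t, e_start, lo - e_start, from);
		if (hi < e_end)
			toy_add(t, hi, e_end - hi, from);
	}
}

/* Return the type of the entry covering addr, or TOY_NONE if none does. */
enum toy_type toy_type_at(const struct toy_table *t, uint64_t addr)
{
	for (size_t i = 0; i < t->nr; i++)
		if (addr >= t->e[i].addr && addr - t->e[i].addr < t->e[i].size)
			return t->e[i].type;
	return TOY_NONE;
}

/* Apply the same RAM -> reserved update to every table passed in. */
void toy_reserve_in_all(struct toy_table **tables, size_t count,
			uint64_t addr, uint64_t size)
{
	for (size_t i = 0; i < count; i++)
		toy_range_update(tables[i], addr, size, TOY_RAM, TOY_RESERVED);
}
```

With three toy tables standing in for e820_table, e820_table_kexec and e820_table_firmware, calling toy_range_update() on the first one only leaves the other two still reporting RAM at that address, which is the mismatch described above. Calling toy_reserve_in_all() on all three makes every reader see the range as reserved.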