On 13/01/2025 11:27, Usama Arif wrote: > > > On 13/01/2025 02:33, Dave Young wrote: >> On Fri, 10 Jan 2025 at 18:54, Usama Arif <usamaarif642@xxxxxxxxx> wrote: >>> >>> >>> >>> On 10/01/2025 07:21, Ard Biesheuvel wrote: >>>> On Thu, 9 Jan 2025 at 17:36, Usama Arif <usamaarif642@xxxxxxxxx> wrote: >>>>> >>>>> >>>>> >>>>> On 09/01/2025 15:45, Ard Biesheuvel wrote: >>>>>> On Wed, 8 Jan 2025 at 23:00, Usama Arif <usamaarif642@xxxxxxxxx> wrote: >>>>>>> >>>>>>> The commit in [1] introduced a check to see if EFI memory attributes >>>>>>> table was corrupted. It assumed that efi.memmap.nr_map remains >>>>>>> constant, but it changes during late boot. >>>>>>> Hence, the check is valid during cold boot, but not in the subsequent >>>>>>> kexec boot. >>>>>>> >>>>>>> This is best explained with an exampled. At cold boot, for a test >>>>>>> machine: >>>>>>> efi.memmap.nr_map=91, >>>>>>> memory_attributes_table->num_entries=48, >>>>>>> desc_size = 48 >>>>>>> Hence, the check introduced in [1] where 3x the size of the >>>>>>> entire EFI memory map is a reasonable upper bound for the size of this >>>>>>> table is valid. >>>>>>> >>>>>>> In late boot __efi_enter_virtual_mode calls 2 functions that updates >>>>>>> efi.memmap.nr_map: >>>>>>> - efi_map_regions which reduces the `count` of map entries >>>>>>> (for e.g. if should_map_region returns false) and which is reflected >>>>>>> in efi.memmap by __efi_memmap_init. >>>>>>> At this point efi.memmap.nr_map becomes 46 in the test machine. >>>>>>> - efi_free_boot_services which also reduces the number of memory regions >>>>>>> available (for e.g. if md->type or md->attribute is not the right value). >>>>>>> At this point efi.memmap.nr_map becomes 9 in the test machine. >>>>>>> Hence when you kexec into a new kernel and pass efi.memmap, the >>>>>>> paramaters that are compared are: >>>>>>> efi.memmap.nr_map=9, >>>>>>> memory_attributes_table->num_entries=48, >>>>>>> desc_size = 48 >>>>>>> where the check in [1] is no longer valid with such a low efi.memmap.nr_map >>>>>>> as it was reduced due to efi_map_regions and efi_free_boot_services. >>>>>>> >>>>>>> A more appropriate check is to see if the description size reported by >>>>>>> efi and memory attributes table is the same. >>>>>>> >>>>>>> [1] https://lore.kernel.org/all/20241031175822.2952471-2-ardb+git@xxxxxxxxxx/ >>>>>>> >>>>>>> Fixes: 8fbe4c49c0cc ("efi/memattr: Ignore table if the size is clearly bogus") >>>>>>> Reported-by: Breno Leitao <leitao@xxxxxxxxxx> >>>>>>> Signed-off-by: Usama Arif <usamaarif642@xxxxxxxxx> >>>>>>> --- >>>>>>> drivers/firmware/efi/memattr.c | 16 ++++++---------- >>>>>>> 1 file changed, 6 insertions(+), 10 deletions(-) >>>>>>> >>>>>> >>>>>> The more I think about this, the more I feel that kexec on x86 should >>>>>> simply discard this table, and run with the firmware code RWX (which >>>>>> is not the end of the world). >>>>> >>>>> >>>>> By discard this table, do you mean kexec not use e820_table_firmware? >>>> >>>> No, I mean kexec ignores the memory attributes table. >>>> >>>>> Also a very basic question, what do you mean by run with the firmware RWX? >>>>> >>>> >>>> The memory attributes table is an overlay for the EFI memory map that >>>> describes which runtime code regions may be mapped with restricted >>>> permissions. Without this table, everything will be mapped writable as >>>> well as executable, but only in the EFI page tables, which are only >>>> active when an EFI call is in progress. >>>> >>> >>> Thanks for explaining! >>> >>> So basically get rid of memattr.c :) >>> >>> Do you mean get rid of it only for kexec, or not do it for any >>> boot (including cold boot)? >>> I do like this idea! I couldn't find this in the git history, >>> but do you know if this was added in the linux kernel just >>> because EFI spec added support for it, or if there was a >>> specific security problem? >>> >> >> Usama, can you try the patch below? >> [ format is wrong due to webmail corruption. But if it works I can >> send a formal patch later ] >> >> $ git diff arch/x86 >> diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c >> index 846bf49f2508..58dc77c5210e 100644 >> --- a/arch/x86/platform/efi/quirks.c >> +++ b/arch/x86/platform/efi/quirks.c >> @@ -561,6 +561,11 @@ int __init efi_reuse_config(u64 tables, int nr_tables) >> >> if (!efi_guidcmp(guid, SMBIOS_TABLE_GUID)) >> ((efi_config_table_64_t *)p)->table = data->smbios; >> + >> + /* Not bother to play with mem attr table across kexec */ >> + if (!efi_guidcmp(guid, EFI_MEMORY_ATTRIBUTES_TABLE_GUID)) >> + ((efi_config_table_64_t *)p)->table = >> EFI_INVALID_TABLE_ADDR; >> + >> p += sz; >> } >> > > This would work, I am guessing it will have a similar effect to what I sent > last week in > https://lore.kernel.org/all/fd63613c-fd26-42de-b5ed-cc734f72eb36@xxxxxxxxx/ > > I think it needs to be wrapped in ifdef CONFIG_X86_64. > IMO we should consider the 2 patches in this series first before disabling it for kexec. These patches actually fix the issue.