On Thu, 31 Oct 2024 at 08:55, Jiri Slaby <jirislaby@xxxxxxxxxx> wrote:
>
> On 25. 10. 24, 9:30, Ard Biesheuvel wrote:
> > To me, it seems like the use of EFI_ACPI_RECLAIM_MEMORY in this case
> > simply tickles a bug in the firmware that causes it to corrupt the
> > memory attributes table. The fact that cold boot behaves differently
> > is a strong indicator here.
> >
> > I didn't see the results of the memory attribute table dumps on the
> > bugzilla thread, but dumping this table from EFI is not very useful
> > because it will get regenerated/updated at ExitBootServices() time.
> > Unfortunately, that also takes away the console so capturing the state
> > of that table before the EFI stub boots the kernel is not an easy
> > thing to do.
> >
> > Is the memattr table completely corrupted? It also has a version
> > field, and only versions 1 and 2 are defined so we might use that to
> > detect corruption.
>
> So from a today test:
> https://bugzilla.suse.com/attachment.cgi?id=878296
>
> > efi: memattr: efi_memattr_init: tab=0x7752f018 ver=1
> size=16+2*1705287680=0x00000000cb494010
>
> version is NOT corrupted :).
>

OK, so the struct looks like this

typedef struct {
        u32 version;
        u32 num_entries;
        u32 desc_size;
        u32 flags;
        efi_memory_desc_t entry[];
} efi_memory_attributes_table_t;

and in the correct case, num_entries == 45 and desc_size == 48.

It is quite easy to sanity check this structure: desc_size should be
equal to the desc_size in the memory map, and num_entries can never
exceed 2x the number of entries in the EFI memory map.

I'll go and implement something that performs the check right after
ExitBootServices(), and just drops the table if it is bogus (it isn't
that important anyway)
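
For illustration only, the check described above could look roughly like
the userspace sketch below. It is not the actual patch: the helper name
memattr_table_is_sane() and its parameters are hypothetical, the
efi_memory_desc_t layout is a simplified stand-in, and the version test
is an extra borrowed from the earlier suggestion that only versions 1
and 2 are defined.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t u32;
typedef uint64_t u64;

/* Simplified stand-in for the descriptor type; an assumption for this sketch. */
typedef struct {
	u32 type;
	u32 pad;
	u64 phys_addr;
	u64 virt_addr;
	u64 num_pages;
	u64 attribute;
} efi_memory_desc_t;

/* Table header as quoted in the mail above. */
typedef struct {
	u32 version;
	u32 num_entries;
	u32 desc_size;
	u32 flags;
	efi_memory_desc_t entry[];
} efi_memory_attributes_table_t;

/*
 * Hypothetical helper: returns true if the table header is plausible
 * given the descriptor size and entry count of the EFI memory map.
 * Only the header is inspected; the entries are never dereferenced.
 */
static bool memattr_table_is_sane(const efi_memory_attributes_table_t *tbl,
				  u32 map_desc_size, u32 map_num_entries)
{
	/* Only versions 1 and 2 are defined. */
	if (tbl->version != 1 && tbl->version != 2)
		return false;

	/* desc_size should match the desc_size of the memory map. */
	if (tbl->desc_size != map_desc_size)
		return false;

	/*
	 * num_entries can never exceed 2x the number of memory map
	 * entries; compare in 64 bits so the doubling cannot overflow.
	 */
	if ((u64)tbl->num_entries > 2 * (u64)map_num_entries)
		return false;

	return true;
}
```

A caller would drop the table whenever this returns false. With the
known-good values above (num_entries == 45, desc_size == 48) and a
memory map using 48-byte descriptors, the check passes as long as the
map has at least 23 entries.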