Hi James,

On Wed, 02 Jun 2021 15:22:00 +0100,
James Morse <james.morse@xxxxxxx> wrote:
>
> Hi Marc,
>
> On 29/04/2021 14:35, Marc Zyngier wrote:
> > It recently became apparent that using kexec with kexec_file_load() on
> > arm64 is pretty similar to playing Russian roulette.
> >
> > Depending on the amount of memory, the HW supported and the firmware
> > interface used, your secondary kernel may overwrite critical memory
> > regions without which the secondary kernel cannot boot (the GICv3 LPI
> > tables being a prime example of such reserved regions).
> >
> > It turns out that there is at least two ways for reserved memory
> > regions to be described to kexec: /proc/iomem for the userspace
> > implementation, and memblock.reserved for kexec_file.
>
> One is spilled into the other by request_standard_resources()...
>
> > And of course,
> > our LPI tables are only reserved using the resource tree, leading to
> > the aforementioned stamping.
>
> Presumably well after efi_init() has run...

Yup, much later. And we can keep on reserving memory as long as we
boot new CPUs. Having it as a one-off sync doesn't really help here.

>
> > Similar things could happen with ACPI tables as well.
>
> efi_init() calls reserve_regions(), which has:
> |       /* keep ACPI reclaim memory intact for kexec etc. */
> |       if (md->type == EFI_ACPI_RECLAIM_MEMORY)
> |               memblock_reserve(paddr, size);
>
> This is also what stops mm from allocating them, as
> memblock-reserved gets copied into the PG_Reserved flag by
> free_low_memory_core_early()'s calls to reserve_bootmem_region().
>
> Is your machines firmware putting them in a region with a different type?

Good question. Moritz (cc'd) saw the tables being overwritten on his
system (which I don't have access to), so I guess it is not entirely
clear-cut how this happens. My SQ box reports the ACPI region as "ACPI
Reclaim", so I guess it works as expected here.
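
For reference, the type-based handling discussed in this thread can be
modelled in userspace roughly as follows. This is only a sketch, not the
kernel code: the memory type values are the ones from the UEFI
specification, but classify_region(), model_is_usable(), self_check() and
the enum treatment are hypothetical names made up for illustration. The
model also assumes the region is RAM and ignores the memory attribute
checks that the real is_usable_memory() performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Memory type values as defined by the UEFI specification. */
enum efi_mem_type {
	EFI_RESERVED_TYPE          = 0,
	EFI_LOADER_CODE            = 1,
	EFI_LOADER_DATA            = 2,
	EFI_BOOT_SERVICES_CODE     = 3,
	EFI_BOOT_SERVICES_DATA     = 4,
	EFI_RUNTIME_SERVICES_CODE  = 5,
	EFI_RUNTIME_SERVICES_DATA  = 6,
	EFI_CONVENTIONAL_MEMORY    = 7,
	EFI_UNUSABLE_MEMORY        = 8,
	EFI_ACPI_RECLAIM_MEMORY    = 9,
	EFI_ACPI_MEMORY_NVS        = 10,
	EFI_PERSISTENT_MEMORY      = 14,
};

/* Hypothetical summary of what happens to a region in this model. */
enum treatment {
	TREAT_USABLE,    /* handed to the allocator as normal RAM */
	TREAT_RESERVED,  /* mapped, but memblock-reserved */
	TREAT_NOMAP,     /* marked nomap, never used as RAM */
};

/*
 * Simplified stand-in for is_usable_memory(): only the type is
 * considered, the attribute checks are deliberately left out.
 */
static bool model_is_usable(enum efi_mem_type type)
{
	switch (type) {
	case EFI_LOADER_CODE:
	case EFI_LOADER_DATA:
	case EFI_ACPI_RECLAIM_MEMORY:
	case EFI_BOOT_SERVICES_CODE:
	case EFI_BOOT_SERVICES_DATA:
	case EFI_CONVENTIONAL_MEMORY:
	case EFI_PERSISTENT_MEMORY:
		return true;
	default:
		return false;
	}
}

/* Model of the per-descriptor decision for a region that is RAM. */
static enum treatment classify_region(enum efi_mem_type type)
{
	if (!model_is_usable(type))
		return TREAT_NOMAP;
	/* keep ACPI reclaim memory intact for kexec etc. */
	if (type == EFI_ACPI_RECLAIM_MEMORY)
		return TREAT_RESERVED;
	return TREAT_USABLE;
}

/* Usage example: the two ACPI placements the UEFI spec allows. */
void self_check(void)
{
	assert(classify_region(EFI_ACPI_RECLAIM_MEMORY) == TREAT_RESERVED);
	assert(classify_region(EFI_ACPI_MEMORY_NVS) == TREAT_NOMAP);
}
```

In this model, both placements the spec allows for ACPI tables end up out
of the allocator's reach, which fits with the overwrite seen on Moritz's
system not being explained by the region type alone.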

> (The UEFI spec has something to say: see 2.3.6 "AArch64 Platforms":
> | ACPI Tables loaded at boot time can be contained in memory of type EfiACPIReclaimMemory
> | (recommended) or EfiACPIMemoryNVS
>
> NVS would fail the is_usable_memory() check earlier, so gets treated
> as nomap)

Note that I've since changed tactics and proposed that we fully rely
on the resource tree instead[1].

Thanks,

        M.

[1] https://lore.kernel.org/r/20210531095720.77469-1-maz@xxxxxxxxxx

-- 
Without deviation from the norm, progress is not possible.