On Wed, 19 May 2021 16:19:44 +0100,
Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
>
> On Thu, Apr 29, 2021 at 02:35:31PM +0100, Marc Zyngier wrote:
> > It recently became apparent that using kexec with kexec_file_load() on
> > arm64 is pretty similar to playing Russian roulette.
> >
> > Depending on the amount of memory, the HW supported and the firmware
> > interface used, your secondary kernel may overwrite critical memory
> > regions without which the secondary kernel cannot boot (the GICv3 LPI
> > tables being a prime example of such reserved regions).
> >
> > It turns out that there is at least two ways for reserved memory
> > regions to be described to kexec: /proc/iomem for the userspace
> > implementation, and memblock.reserved for kexec_file. And of course,
> > our LPI tables are only reserved using the resource tree, leading to
> > the aforementioned stamping. Similar things could happen with ACPI
> > tables as well.
>
> So which one of these (/proc/iomem and memblock.reserved) would be the
> correct option? If none of them, is their intersection any better?

/proc/iomem is what we use for userspace, so you'd expect this to be
the right thing to use.

> Looking at the default kexec_locate_mem_hole(), it uses the resources
> tree if !CONFIG_ARCH_KEEP_MEMBLOCK, otherwise memblock.

Yup, and funnily enough, forcing a fallback to the resources tree
doesn't help either, because the logic used here isn't much better (it
takes the RAM areas at face value, without excluding any of the
reserved regions that are children of the "System RAM" regions).

It's not funny anymore.

> PowerPC implements its own arch_kexec_locate_mem_hole() to skip specific
> arch regions. We could do something similar for arm64 if the arch code
> knows where the LPI reservation is or the ACPI tables.

It feels like a bit of a failure to duplicate all that code. I'd
consider that the last possible outcome.
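
To make the "reserved children of System RAM" point concrete, here is a
minimal userspace sketch (not kernel code) of a bottom-up hole search
that rejects any candidate overlapping a reserved range. struct range,
first_overlap() and find_hole() are hypothetical names made up for this
illustration; the reserved list stands in for whichever source ends up
being consulted (resource-tree children, memblock.reserved, or both).
It assumes half-open ranges and a power-of-two alignment, and does not
guard against address overflow near the top of the 64-bit space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open physical address range [start, end). Illustrative only. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Round x up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Return some reserved range overlapping [start, end), or NULL if none does. */
static const struct range *first_overlap(const struct range *resv, size_t nr,
					 uint64_t start, uint64_t end)
{
	for (size_t i = 0; i < nr; i++)
		if (resv[i].start < end && start < resv[i].end)
			return &resv[i];
	return NULL;
}

/*
 * Look for 'size' bytes, aligned to 'align', that sit inside one RAM range
 * and overlap none of the reserved ranges. On success, store the start
 * address in *out and return true.
 */
static bool find_hole(const struct range *ram, size_t nr_ram,
		      const struct range *resv, size_t nr_resv,
		      uint64_t size, uint64_t align, uint64_t *out)
{
	assert(size > 0);
	assert(align != 0 && (align & (align - 1)) == 0);

	for (size_t i = 0; i < nr_ram; i++) {
		uint64_t start = align_up(ram[i].start, align);

		while (start >= ram[i].start && start < ram[i].end &&
		       ram[i].end - start >= size) {
			const struct range *hit =
				first_overlap(resv, nr_resv, start, start + size);

			if (!hit) {
				*out = start;
				return true;
			}
			/* Candidate hits a reservation: retry just past it. */
			start = align_up(hit->end, align);
		}
	}
	return false;
}
```

Taking the RAM range "at face value" corresponds to calling find_hole()
with an empty reserved list, which happily returns the very start of
RAM even if an LPI table lives there.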

> If we conclude that we need some intersection of resource reservations
> and memblock, maybe we should change the default kexec_locate_mem_hole()
> implementation to check for both (e.g. start with the resource tree and
> only consider a range valid if not in memblock.reserved).

I am more angling towards this. But my worry is that different
architectures already have different ways to reserve memory (PPC seems
to do its own stuff on top of memblock, and x86, I assume, uses the
resource tree in a different way than arm64).

Anyway, I'll keep digging.

	M.

--
Without deviation from the norm, progress is not possible.

_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec