> The confusing part was talking about memory being still in use,
> that is actually scheduled for use in the future.

+1

>
>>> Usually somewhere in the loaded image
>>> is a copy of the memory map at the time the kexec kernel was loaded.
>>> That will invalidate the memory map as well.
>>
>> Ah, unconditionally. Sure, x86 needs this.
>> (arm64 re-discovers the memory map from firmware tables after kexec)

Does this include hotplugged DIMMs, e.g., under KVM?

[...]

>>> All of this should be for a very brief window of a few seconds, as
>>> the loaded kexec image is quite short.
>>
>> It seems I'm the outlier anticipating anything could happen between
>> those syscalls.
>
> The design is:
> 	sys_kexec_load()
> 	shutdown scripts
> 	sys_reboot(LINUX_REBOOT_CMD_KEXEC);
>
> There are two system calls simply so that the shutdown scripts can run.
> Now maybe someone somewhere does something different but that is not
> expected.
>
> Only the kexec on panic kernel is expected to persist somewhat
> indefinitely. But that should be in memory that is reserved from boot
> time, and so the memory hotplug should have enough visibility to not
> allow that memory to be given up.

Yes, and AFAIK, memory blocks which hold the reserved crashkernel area
can usually not get offlined and, therefore, the memory cannot get
removed.

Interestingly, s390x even has a hotplug notifier for that:

	arch/s390/kernel/setup.c:kdump_mem_notifier()

(Offlining of memory on s390x can result in memory getting depopulated
in the hypervisor, so after it has been offlined, it would no longer be
accessible. I somewhat doubt that this notifier is really needed - all
pages in the crashkernel area should look like ordinary allocated pages
when the area is reserved early during boot via the memblock allocator,
and therefore offlining cannot succeed. But that's a different story -
and I suspect this is a leftover from pre-memblock times.)

-- 
Thanks,

David / dhildenb