On 04/05/23 3:46 am, Eric DeVolder wrote:
When the kdump service is loaded, if a CPU or memory is hot un/plugged, the crash elfcorehdr, which describes the CPUs and memory in the system, must also be updated, else the resulting vmcore is inaccurate (eg. missing either CPU context or memory regions).

The current solution (eg. RHEL /usr/lib/udev/rules.d/98-kexec.rules) utilizes udev to initiate an unload-then-reload of the *entire* kdump image (eg. kernel, initrd, boot_params, purgatory and elfcorehdr) by the userspace kexec utility. In a previous kernel patch post I have outlined the significant performance problems related to offloading this activity to userspace.

As such, I've been working to provide the ability for the Linux kernel to directly modify the elfcorehdr in response to hotplug changes.

  https://lore.kernel.org/lkml/20230404180326.6890-1-eric.devolder@xxxxxxxxxx/

The series listed above is v21, and the v22 contains changes that work in concert with the v2 changes cited within. (I'm posting the kexec-tools changes first so I can reference them in the kernel v22 posting.)

I believe this work to be nearing the finish line. As such, I'd like to start posting the kexec-tools userspace changes for review in order to minimize the time to adoption.

This kexec-tools patch series is for supporting the kexec_load syscall only. The kernel patch series cited above is self-contained for the kexec_file_load syscall, requiring no userspace help.

There are two basic obstacles/requirements for the kexec-tools to overcome in order to support kernel hotplug rewriting of the elfcorehdr.

First, the buffer containing the elfcorehdr must be excluded from the purgatory checksum/digest, which is computed at load time. Otherwise kernel run-time changes to the elfcorehdr, as a result of hot un/plug, would result in the checksum failing (specifically in purgatory at panic kernel boot time), and kdump capture kernel failing to start.
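To illustrate the first requirement, here is a minimal sketch of computing a digest over a set of loaded segments while skipping one of them. It is not the kexec-tools code: `struct segment` and `digest_segments()` are hypothetical names invented for this illustration, and FNV-1a is used only as a small stand-in for the real purgatory checksum/digest.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a loaded kexec segment. */
struct segment {
    const unsigned char *buf;
    size_t size;
};

/*
 * Fold all segments into one digest, skipping the segment at index
 * `skip` (pass -1 to skip nothing). FNV-1a is an assumption made for
 * brevity; it only stands in for the digest checked by purgatory.
 */
uint64_t digest_segments(const struct segment *segs, int nr, int skip)
{
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV-1a 64-bit offset basis */

    for (int i = 0; i < nr; i++) {
        if (i == skip)
            continue;   /* e.g. the elfcorehdr segment in hotplug mode */
        for (size_t j = 0; j < segs[i].size; j++) {
            h ^= segs[i].buf[j];
            h *= 0x100000001b3ULL;        /* FNV-1a 64-bit prime */
        }
    }
    return h;
}
```

With the elfcorehdr segment skipped, a later rewrite of that buffer (different contents or a different length) leaves the digest unchanged, which is the property the hotplug support relies on. With nothing skipped, the same rewrite changes the digest and the load-time value no longer matches.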
To let the kernel know it is okay to modify the elfcorehdr, kexec sets the KEXEC_UPDATE_ELFCOREHDR flag.

NOTE: The kernel specifically does *NOT* attempt to recompute the checksum/digest as that would ultimately require patching the in-memory purgatory image with the updated checksum. As that purgatory image is already fully linked, it is a binary blob containing no ELF information which would allow it to be re-linked or patched. Thus excluding the elfcorehdr from the checksum/digest avoids all these problems.

Second, the size of the elfcorehdr buffer must be large enough to accommodate growth of the number of CPUs and/or memory regions.

To satisfy the first requirement, this patch series introduces the --hotplug option to indicate to kexec-tools that kexec should exclude the elfcorehdr buffer from the purgatory checksum/digest calculation and set the KEXEC_UPDATE_ELFCOREHDR flag.

To satisfy the second requirement, the size is obtained from the (proposed in the kernel series above) /sys/kernel/crash_elfcorehdr_size node, or it can be specified manually with the new --elfcorehdrsz= option.

I am intentionally posting this series before the kernel changes have been merged. I'm hoping to facilitate discussion as to how kexec-tools wants to handle the soon-to-be new kernel feature.

Discussion items:

- It is worth noting that deploying kexec-tools, with this series included, on kernels that do NOT have the kernel hotplug series cited above, is safe to do. Running kexec-tools with the --hotplug option on a kernel without hotplug elfcorehdr support simply removes the elfcorehdr buffer from the digest. This does not prevent kdump from operating; the only risk is a slight chance of corruption of the elfcorehdr, as it is now not covered by the checksum. Using the --elfcorehdrsz option on a kernel without hotplug elfcorehdr support simply results in a possibly oversized buffer for the elfcorehdr; there is no harm in that.
- While I currently have --hotplug as an option, the option could be eliminated (or its polarity reversed), as it would be safe to *always* omit the elfcorehdr from the checksum/digest for purgatory. If this were the case, then distros would not have to make any changes to kdump scripts to pass the --hotplug option. Then, when their kernel does include the kernel patch series cited above, kdump and hotplug would "just work".

- I'm unsure if these options should be kept as common/global kexec options, or moved to arch options.

- I'm only showing x86 support (and testing) at this time, but it would be straightforward to provide similar support for the other architectures in a future patch revision.
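For reference, the size selection described for the second requirement could look roughly like the sketch below. The function name `pick_elfcorehdr_size()` is hypothetical, and three things are assumptions rather than facts from the series: that a manual value takes precedence over the sysfs node, that the node holds a single integer, and that a caller-supplied fallback is used when the node is absent.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Choose the elfcorehdr buffer size (hypothetical helper):
 *   1. a non-zero manual value (as from --elfcorehdrsz=) wins;
 *   2. otherwise read one integer from `node`;
 *   3. otherwise return `fallback`.
 */
unsigned long long pick_elfcorehdr_size(const char *node,
                                        unsigned long long manual,
                                        unsigned long long fallback)
{
    char line[64];
    char *end;
    unsigned long long val;
    FILE *fp;

    if (manual)
        return manual;

    fp = fopen(node, "r");
    if (!fp)
        return fallback;    /* node absent, e.g. kernel lacks the series */

    if (!fgets(line, sizeof(line), fp)) {
        fclose(fp);
        return fallback;    /* empty or unreadable node */
    }
    fclose(fp);

    errno = 0;
    val = strtoull(line, &end, 0);  /* base 0: decimal or 0x-prefixed hex */
    if (errno || end == line || val == 0)
        return fallback;
    return val;
}
```

A caller would pass "/sys/kernel/crash_elfcorehdr_size" as `node`. On a kernel without the hotplug series the node does not exist, so the fallback path is taken, which matches the point above that running the new kexec-tools on such kernels is harmless.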
True. Should be straightforward to add similar support for other architectures. For example, powerpc would need another flag KEXEC_UPDATE_FDT on top of the flag to update elfcorehdr.

Looks good to me.
For the series..

Acked-by: Hari Bathini <hbathini@xxxxxxxxxxxxx>
Thanks!
eric
---
v2: 3may2023
 - Setting KEXEC_UPDATE_ELFCOREHDR flag
 - Utilizing /sys/kernel/crash_elfcorehdr_size info.

v1: 20oct2022
 http://lists.infradead.org/pipermail/kexec/2022-October/026032.html
 - Initial patch series

RFC:
 https://lore.kernel.org/lkml/b04ed259-dc5f-7f30-6661-c26f92d9096a@xxxxxxxxxx/
 s/vmcoreinfo/elfcorehdr/g
---
Eric DeVolder (6):
  kexec: define KEXEC_UPDATE_ELFCOREHDR
  crashdump: introduce the hotplug command line options
  crashdump: setup hotplug support
  crashdump: exclude elfcorehdr segment from digest for hotplug
  crashdump/x86: identify elfcorehdr segment for hotplug
  crashdump/x86: set the elfcorehdr segment size for hotplug

 kexec/arch/i386/crashdump-x86.c |  8 ++++++
 kexec/kexec-syscall.h           |  1 +
 kexec/kexec.c                   | 45 +++++++++++++++++++++++++++++++++
 kexec/kexec.h                   | 10 +++++++-
 4 files changed, 63 insertions(+), 1 deletion(-)
_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec