On 05/03/23 at 06:41pm, Eric DeVolder wrote:
> The hotplug support for kexec_load() requires coordination with
> userspace, and therefore a little extra help from the kernel to
> facilitate the coordination.
>
> In the absence of the solution contained within this particular
> patch, if a kdump capture kernel is loaded via kexec_load() syscall,
> then the crash hotplug logic would find the segment containing the
> elfcorehdr, and upon a hotplug event, rewrite the elfcorehdr. While
> generally speaking that is the desired behavior and outcome, a
> problem arises from the fact that if the kdump image includes a
> purgatory that performs a digest checksum, then that check would
> fail (because the elfcorehdr was changed), and the capture kernel
> would fail to boot and no kdump occur.
>
> Therefore, what is needed is for the userspace kexec-tools to
> indicate to the kernel whether or not the supplied kdump
> image/elfcorehdr can be modified (because the kexec-tools excludes the
> elfcorehdr from the digest, and sizes the elfcorehdr memory buffer
> appropriately).
>
> To solve these problems, this patch introduces:
>  - a new kexec flag KEXEC_UPDATE_ELFCOREHDR to indicate that it is
>    safe for the kernel to modify the elfcorehdr (because kexec-tools
>    has excluded the elfcorehdr from the digest).
>  - the /sys/kernel/crash_elfcorehdr_size node to communicate to
>    kexec-tools what the preferred size of the elfcorehdr memory buffer
>    should be in order to accommodate hotplug changes.
>  - The sysfs crash_hotplug nodes (ie.
>    /sys/devices/system/[cpu|memory]/crash_hotplug) are now dynamic in
>    that they examine kexec_file_load() vs kexec_load(), and when
>    kexec_load(), whether or not KEXEC_UPDATE_ELFCOREHDR is in effect.
>    This is critical so that the udev rule processing of crash_hotplug
>    indicates correctly (ie. the userspace unload-then-load of the
>    kdump image can be skipped, or not).
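
For illustration only, the userspace side of this handshake might read the nodes roughly as in the sketch below. The sysfs paths are the ones named in the patch description. The helper names, the `root` parameter (present only so the code can be pointed at a scratch directory), and the assumption that each node holds a single decimal integer are hypothetical and are not taken from kexec-tools.

```python
from pathlib import Path

# Node paths as named in the patch description, relative to the filesystem root.
ELFCOREHDR_SIZE_NODE = "sys/kernel/crash_elfcorehdr_size"
CRASH_HOTPLUG_NODES = {
    "cpu": "sys/devices/system/cpu/crash_hotplug",
    "memory": "sys/devices/system/memory/crash_hotplug",
}


def read_int_node(path):
    """Return the integer stored in a sysfs-style file, or None if it is missing.

    Assumption: the node contains one decimal integer followed by a newline.
    """
    try:
        return int(Path(path).read_text().strip())
    except FileNotFoundError:
        return None


def preferred_elfcorehdr_size(root="/"):
    """Size the kernel suggests for the elfcorehdr buffer, or None if not exposed."""
    return read_int_node(Path(root) / ELFCOREHDR_SIZE_NODE)


def crash_hotplug_state(root="/"):
    """Map 'cpu' and 'memory' to True when the matching crash_hotplug node shows 1."""
    return {
        name: read_int_node(Path(root) / rel) == 1
        for name, rel in CRASH_HOTPLUG_NODES.items()
    }
```

A loader could call `preferred_elfcorehdr_size()` before building the image, and a udev helper could consult `crash_hotplug_state()` to decide whether a reload is needed. On a kernel without this patch the nodes are absent, so the first call returns None and the second reports False for both entries.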
>
> With this patch in place, I believe the following statements to be true
> (with local testing to verify):
>
>  - For systems which have these kernel changes in place, but not the
>    corresponding changes to the crash hot plug udev rules and
>    kexec-tools, (ie "older" systems) those systems will continue to
>    unload-then-load the kdump image, as has always been done. The
>    kexec-tools will not set KEXEC_UPDATE_ELFCOREHDR.
>  - For systems which have these kernel changes in place and the proposed
>    udev rule changes in place, but not the kexec-tools changes in place:
>     - the use of kexec_load() will not set KEXEC_UPDATE_ELFCOREHDR and
>       so the unload-then-reload of kdump image will occur (the sysfs
>       crash_hotplug nodes will show 0).
>     - the use of kexec_file_load() will permit sysfs crash_hotplug nodes
>       to show 1, and the kernel will modify the elfcorehdr directly. And
>       with the udev changes in place, the unload-then-load will not occur!
>  - For systems which have these kernel changes as well as the udev and
>    kexec-tools changes in place, then the user/admin has full authority
>    over the enablement and support of crash hotplug support, whether via
>    kexec_file_load() or kexec_load().
>
> Said differently, as kexec_load() was/is widely in use, these changes
> permit it to continue to be used as-is (retaining the current
> unload-then-reload behavior) until such time as the udev and kexec-tools
> changes can be rolled out as well.
>
> I've intentionally kept the changes related to userspace coordination
> for kexec_load() separate as this need was identified late; the
> rest of this series has been generally reviewed and accepted. Once
> this support has been vetted, I can refactor if needed.
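
The compatibility cases listed above can be written down as a small truth-table model, which may help when checking a given combination. This is a simplified sketch of the statements in the patch description, not kernel or udev code. The function names, the string values for `syscall`, and the `admin_enables` parameter are hypothetical, and the model ignores whether a kdump image is loaded at all.

```python
def crash_hotplug_node(syscall, update_elfcorehdr_flag=False):
    """Value the sysfs crash_hotplug nodes are described as showing (1 or 0)."""
    if syscall == "kexec_file_load":
        # The kernel built the image itself, so it may rewrite the elfcorehdr.
        return 1
    if syscall == "kexec_load":
        # Only safe when userspace passed KEXEC_UPDATE_ELFCOREHDR.
        return 1 if update_elfcorehdr_flag else 0
    raise ValueError(f"unknown syscall: {syscall!r}")


def userspace_reloads_kdump(syscall, udev_updated, kexec_tools_updated,
                            admin_enables=True):
    """Return True when userspace still does unload-then-load on a hotplug event."""
    # Assumption: only updated kexec-tools can set the flag, and only when the
    # admin asks for it; the flag is meaningful for kexec_load() only.
    flag = kexec_tools_updated and admin_enables and syscall == "kexec_load"
    node = crash_hotplug_node(syscall, flag)
    # Assumption: old udev rules never consult the node and always reload.
    return (not udev_updated) or node == 0
```

For example, `userspace_reloads_kdump("kexec_load", udev_updated=True, kexec_tools_updated=False)` is True (the nodes show 0), while the same call with `"kexec_file_load"` is False because the kernel updates the elfcorehdr directly.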
>
> Suggested-by: Hari Bathini <hbathini@xxxxxxxxxxxxx>
> Signed-off-by: Eric DeVolder <eric.devolder@xxxxxxxxxx>

LGTM,

Acked-by: Baoquan He <bhe@xxxxxxxxxx>

_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec