[RFC PATCH 0/5] Avoid kdump service reload on CPU hotplug events

Sourabh Jain <sourabhjain@xxxxxxxxxxxxx> · Mon, 21 Feb 2022 14:16:19 +0530

On a hotplug event (CPU/memory), the CPU information prepared for the kdump
kernel becomes stale unless it is prepared again. To keep the CPU information
up to date, a kdump service reload is triggered via a udev rule.

The above approach has two downsides:

1) The udev rules are prone to races if hotplug events are frequent. When
   multiple CPU/memory hotplug operations are performed at the same time, it
   takes a significant amount of time for all the requested kdump service
   reloads to settle. This creates a window where a kernel crash might not
   lead to a successful dump collection.

2) Unnecessary CPU cycles are consumed to reload all the kdump components,
   including initrd, vmlinux, FDT, etc., whereas only one component, the FDT,
   needs to be updated.

How does this patch series solve the above issue?
-------------------------------------------------
As mentioned above, the only kexec segment that gets updated during
the kdump service reload (due to a hotplug event) is the FDT. So, instead
of re-creating the FDT on every hotplug event, it is created just
once and updated in place on every hotplug event. This FDT is referred to
as the kexec crash FDT.

How is the kexec crash FDT managed?
-----------------------------------
During kernel boot, a hole is allocated for the kexec crash FDT in the kdump
reserved region. On kdump service start, a fresh copy of the kdump FDT
(created by the kexec tool or by the kernel, depending on which system call
is used) is copied to the pre-allocated hole for the kexec crash FDT. Once a
kexec crash FDT is loaded, all subsequent updates needed due to CPU hot-add
events can be made directly to the kexec crash FDT without reloading all the
kexec segments again. A hook is added on the CPU hot-add path to update the
kexec crash FDT.
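
The lifecycle above can be illustrated with a small userspace toy model. This
is only a sketch and not the kernel code from this series: the class name, the
method names and the comma-separated "cpu@N" blob standing in for a real
flattened device tree are all hypothetical.

```python
class KdumpState:
    """Toy model of the kexec crash FDT hole (hypothetical, not kernel code)."""

    def __init__(self, hole_size):
        self.hole = bytearray(hole_size)  # reserved once, at boot
        self.fdt_len = 0                  # valid FDT bytes currently in the hole
        self.loaded = False               # True once kdump service has started
        self.segment_loads = 0            # how often all segments were loaded

    def _write_fdt(self, cpus):
        # Stand-in for building an FDT: a real blob is a flattened device tree.
        blob = ",".join(f"cpu@{c}" for c in sorted(cpus)).encode()
        if len(blob) > len(self.hole):
            raise ValueError("FDT does not fit in the pre-allocated hole")
        self.hole[:len(blob)] = blob
        self.fdt_len = len(blob)

    def service_start(self, cpus):
        """kdump service start: load every segment, copy the FDT into the hole."""
        self.segment_loads += 1
        self._write_fdt(cpus)
        self.loaded = True

    def cpu_hot_add(self, cpus):
        """Hot-add hook: touch only the FDT in the hole, reload nothing else."""
        if self.loaded:
            self._write_fdt(cpus)

    def fdt(self):
        return bytes(self.hole[:self.fdt_len])
```

In this model a hot-add after service start changes the contents of the hole
while segment_loads stays at 1, which is the saving this series is after.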

How is the kexec crash FDT accessed in the kexec_load and kexec_file_load system calls?
---------------------------------------------------------------------------------------
Since all kexec segments are prepared in the kernel for the kexec_file_load
system call, the kernel can easily access the kexec crash FDT with the help of
two global variables that hold the start address and the size of the kexec
crash FDT.

In the kexec_load system call, the kexec segments are prepared by the kexec
tool in userspace. The start address and the size of the kexec crash FDT are
provided to userspace via two sysfs files, /sys/kernel/kexec_crash_fdt and
/sys/kernel/kexec_crash_fdt_size.
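
As a rough illustration of the userspace side, the two sysfs files could be
read as sketched below. The file paths are the ones named above and exist only
on a kernel with this series applied. The value format is an assumption (a
single integer, either decimal or 0x-prefixed hex), and the helper names are
hypothetical; the real consumer would be the kexec tool, not a script.

```python
from pathlib import Path

# Paths from this cover letter; present only with this series applied.
FDT_ADDR_FILE = "/sys/kernel/kexec_crash_fdt"
FDT_SIZE_FILE = "/sys/kernel/kexec_crash_fdt_size"


def parse_sysfs_number(text):
    """Parse one integer; assumes decimal or 0x-prefixed hex (an assumption)."""
    return int(text.strip(), 0)


def read_crash_fdt_region(addr_file=FDT_ADDR_FILE, size_file=FDT_SIZE_FILE):
    """Return (start, size) of the kexec crash FDT, or None if unavailable."""
    try:
        start = parse_sysfs_number(Path(addr_file).read_text())
        size = parse_sysfs_number(Path(size_file).read_text())
    except (OSError, ValueError):
        # Files missing (kernel without this series) or unexpected format.
        return None
    return start, size


if __name__ == "__main__":
    region = read_crash_fdt_region()
    if region is None:
        print("kexec crash FDT is not exposed by this kernel")
    else:
        start, size = region
        print(f"kexec crash FDT at {start:#x}, {size} bytes")
```

Returning None instead of raising lets a caller fall back to building a fresh
FDT segment when the running kernel does not provide the two files.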

A couple of minor changes are required to realise the benefit of the patch
series:

- disable the udev rule:

  comment out the line below in the kdump udev rule file:
  RHEL: /usr/lib/udev/rules.d/98-kexec.rules
  # SUBSYSTEM=="cpu", ACTION=="online", GOTO="kdump_reload_cpu"

- the kexec tool needs to be updated with a patch for the kexec_load system
  call to work (not needed if the -s option is used during kexec panic load):

---