Re: [PATCH v19 6/7] crash: change crash_prepare_elf64_headers() to for_each_possible_cpu()

Hello Eric,

On 06/03/23 21:52, Eric DeVolder wrote:
The function crash_prepare_elf64_headers() generates the elfcorehdr
which describes the cpus and memory in the system for the crash kernel.
In particular, it writes out ELF PT_LOADs for the memory regions and
ELF PT_NOTEs for the cpus in the system.

With respect to the cpus, the current implementation utilizes
for_each_present_cpu() which means that as cpus are added and removed,
the elfcorehdr must again be updated to reflect the new set of cpus.

The reasoning behind the change to use for_each_possible_cpu() is:

- At kernel boot time, all percpu crash_notes are allocated for all
   possible cpus; that is, crash_notes are not allocated dynamically
   when cpus are plugged/unplugged. Thus the crash_notes for each
   possible cpu are always available.

- The crash_prepare_elf64_headers() creates an ELF PT_NOTE per cpu.
   Changing to for_each_possible_cpu() is valid as the crash_notes
   pointed to by each cpu PT_NOTE are present and always valid.
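
The two points above can be illustrated with a small user-space model.
This is only a sketch, not kernel code: the 64-bit mask type and the
helpers count_cpu_notes() and header_covers() are hypothetical names
made up for illustration, standing in for the kernel's cpumasks and
for the per-cpu loop in crash_prepare_elf64_headers().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one bit per cpu id, at most 64 cpus. */
typedef uint64_t cpumask_model_t;

/* Number of cpu PT_NOTE headers a loop over 'mask' would emit. */
static unsigned int count_cpu_notes(cpumask_model_t mask,
				    unsigned int nr_cpu_ids)
{
	unsigned int cpu, notes = 0;

	assert(nr_cpu_ids <= 64);
	for (cpu = 0; cpu < nr_cpu_ids; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			notes++;
	return notes;
}

/*
 * Returns 1 if headers built from 'described' already cover every
 * cpu in 'present_now', 0 if a regeneration would be needed.
 */
static int header_covers(cpumask_model_t described,
			 cpumask_model_t present_now)
{
	return (present_now & ~described) == 0;
}
```

With possible = 0xFF and present = 0x0F, hot-adding cpu 4 makes the
present mask 0x1F. Headers built from the old present mask no longer
cover it, while headers built from the possible mask still do, because
the present cpus are always a subset of the possible cpus.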

Furthermore, examining a common crash processing path of:

  kernel panic -> crash kernel -> makedumpfile -> 'crash' analyzer
            elfcorehdr      /proc/vmcore     vmcore

reveals how the ELF cpu PT_NOTEs are utilized:

- Upon panic, each cpu is sent an IPI and shuts itself down, recording
  its state in its crash_notes. When all cpus are shut down, the
  crash kernel is launched with a pointer to the elfcorehdr.

- The crash kernel via linux/fs/proc/vmcore.c does not examine or
  use the contents of the PT_NOTEs, it exposes them via /proc/vmcore.

- The makedumpfile utility uses /proc/vmcore and reads the cpu
  PT_NOTEs to craft a nr_cpus variable, which is reported in a
  header but otherwise generally unused. Makedumpfile creates the
  vmcore.

- The 'crash' dump analyzer does not appear to reference the cpu
  PT_NOTEs. Instead it looks up the cpu_[possible|present|online]_mask
  symbols and directly examines those structure contents from vmcore
  memory. From that information it is able to determine which cpus
  are present and online, and locate the corresponding crash_notes.
  Said differently, it appears that 'crash' analyzer does not rely
  on the ELF PT_NOTEs for cpus; rather it obtains the information
  directly via kernel symbols and the memory within the vmcore.

(There may be other vmcore generating and analysis tools that do use
these PT_NOTEs, but 'makedumpfile' and 'crash' seem to be the most
common solution.)

This change results in the benefit of having all cpus described in
the elfcorehdr, and therefore reducing the need to re-generate the
elfcorehdr on cpu changes, at the small expense of an additional
56 bytes per PT_NOTE for not-present-but-possible cpus.
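
The 56-byte figure is the size of one 64-bit ELF program header. A
minimal sketch of the arithmetic follows; struct phdr64_model is a
field-for-field mirror of Elf64_Phdr under a made-up name, and
extra_elfcorehdr_bytes() is a hypothetical helper, neither of which
exists in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirror of the Elf64_Phdr layout, renamed to avoid clashing with <elf.h>. */
struct phdr64_model {
	uint32_t p_type;
	uint32_t p_flags;
	uint64_t p_offset;
	uint64_t p_vaddr;
	uint64_t p_paddr;
	uint64_t p_filesz;
	uint64_t p_memsz;
	uint64_t p_align;
};

_Static_assert(sizeof(struct phdr64_model) == 56,
	       "one 64-bit program header is 56 bytes");

/* Extra elfcorehdr bytes spent on possible-but-not-present cpus. */
static size_t extra_elfcorehdr_bytes(unsigned int nr_possible,
				     unsigned int nr_present)
{
	assert(nr_possible >= nr_present);
	return (size_t)(nr_possible - nr_present) *
	       sizeof(struct phdr64_model);
}
```

For example, a system with 64 possible and 16 present cpus would carry
48 additional headers, or 48 * 56 = 2688 bytes.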

On systems where kexec_file_load() syscall is utilized, all the above
is valid. On systems where kexec_load() syscall is utilized, there
may be the need for the elfcorehdr to be regenerated once. The reason
being that some archs only populate the 'present' cpus in the
/sys/devices/system/cpu entries, which the userspace 'kexec' utility
uses to generate the userspace-supplied elfcorehdr. In this situation,
one memory or cpu change will rewrite the elfcorehdr via the
crash_prepare_elf64_headers() function and now all possible cpus will
be described, just as with kexec_file_load() syscall.

Suggested-by: Sourabh Jain <sourabhjain@xxxxxxxxxxxxx>
Signed-off-by: Eric DeVolder <eric.devolder@xxxxxxxxxx>
---
  kernel/crash_core.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index dba4b75f7541..537b199a8774 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -365,7 +365,7 @@ int crash_prepare_elf64_headers(struct crash_mem *mem, int need_kernel_map,
  	ehdr->e_phentsize = sizeof(Elf64_Phdr);
  	/* Prepare one phdr of type PT_NOTE for each present CPU */

We need to change this comment as well, since the loop now walks the
possible CPUs rather than the present ones.

-	for_each_present_cpu(cpu) {
+	for_each_possible_cpu(cpu) {

- Sourabh Jain

_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec