Hello Lianbo and Tao,

On Fri, Dec 15, 2023 at 01:29:48PM +0530, Aditya Gupta wrote:
>
> ... <snip> ...
>
> Known Issues:
> =============
>
> 1. In gdb mode, 'bt' might fail to show backtrace in few vmcores collected
>    from older kernels. This is a known issue due to register mismatch, and
>    its fix has been merged upstream:
>
>    This can also cause some 'invalid kernel virtual address' errors during gdb
>    unwinding the stack registers
>
>    Commit: https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef785819e72db79
>

Regarding the backtrace you posted earlier with the invalid address errors:

    crash> gdb bt
    #0 0xc000000000281298 in crash_setup_regs (gdb: invalid kernel virtual address: fffffffffffffffb type: "gdb_readmem callback"
    gdb: invalid kernel virtual address: fffffffffffffff7 type: "gdb_readmem callback"
    gdb: invalid kernel virtual address: fffffffffffffff3 type: "gdb_readmem callback"
    gdb: invalid kernel virtual address: fffffffffffffffb type: "gdb_readmem callback"
    gdb: invalid kernel virtual address: fffffffffffffff7 type: "gdb_readmem callback"
    gdb: invalid kernel virtual address: fffffffffffffff3 type: "gdb_readmem callback"
    oldregs=<optimized out>, newregs=0xc00000000c0f7908) at ./arch/powerpc/include/asm/kexec.h:69
    #1 __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:975
    #2 0xfffffffffffffffb in ?? ()
    Backtrace stopped: previous frame inner to this frame (corrupt stack?)

To detect that this is caused by the register mismatch issue in the kernel,
and to print a warning, I found this snippet in the 'drgn' tool's source [1]:

    // In most cases, nip (word 32) contains the program counter. But, the
    // NT_PRSTATUS note in Linux kernel vmcores before Linux kernel commit
    // b684c09f09e7 ("powerpc: update ppc_save_regs to save current r1 in
    // pt_regs") (in v6.5) is odd, and the saved stack pointer (r1) is for
    // the program counter in the link register (word 36). The fix was also
    // backported to several stable branches. Unfortunately, there's no good
    // way to detect it other than the kernel version.
    bool r1_is_for_lr = linux_kernel_prstatus;
    if (linux_kernel_prstatus) {
        char *p = (char *)prog->vmcoreinfo.osrelease;
        long major = strtol(p, &p, 10), minor = 0, patch = 0;
        if (*p == '.') {
            minor = strtol(p + 1, &p, 10);
            if (*p == '.')
                patch = strtol(p + 1, NULL, 10);
        }
        if (major > 6 || (major == 6 && minor >= 5))
            r1_is_for_lr = false;
        // Commit cc46085350ccae5f3a2a55a48ab93ebf328d5e24 in v6.4.4.
        if (major == 6 && minor == 4 && patch >= 4)
            r1_is_for_lr = false;
        // Commit ca9465056e1a40ec0b729c115871b1b17755b631 in v6.3.13.
        if (major == 6 && minor == 3 && patch >= 13)
            r1_is_for_lr = false;
        // Commit 865d128cab0ded06c41b06cfdc191ef3d121a95f in v6.1.39.
        if (major == 6 && minor == 1 && patch >= 39)
            r1_is_for_lr = false;
        // Commit 3786416e1fa2ec491b25a0bae6deec163a8795d1 in v5.15.121.
        if (major == 5 && minor == 15 && patch >= 121)
            r1_is_for_lr = false;
    }

drgn uses the kernel version to decide whether the issue is present. I don't
know how reliable that is, but should I implement it the same way and print a
warning?

[1] https://github.com/osandov/drgn/blob/a7f9db306764132db3c6344110a50510a7f8911b/libdrgn/arch_ppc64.c#L96

Thanks,
Aditya Gupta
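
For discussion, here is a minimal sketch of how such a version check might look as a standalone helper that prints a warning. It only assumes that the OSRELEASE string from vmcoreinfo is available as a C string. The names ppc64_regs_may_mismatch() and warn_if_regs_may_mismatch() are hypothetical and are not existing crash functions. The list of fixed stable releases is copied from the drgn snippet quoted in this mail.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* First release on each stable branch that carries the backported fix,
 * as listed in the drgn snippet. */
struct fixed_release {
	long major, minor, patch;
};

static const struct fixed_release fixed_stable[] = {
	{ 6, 4, 4 },	/* cc46085350cc */
	{ 6, 3, 13 },	/* ca9465056e1a */
	{ 6, 1, 39 },	/* 865d128cab0d */
	{ 5, 15, 121 },	/* 3786416e1fa2 */
};

/*
 * Hypothetical helper: returns true if a ppc64 vmcore from the given
 * kernel release may have the r1/LR register mismatch, judging only by
 * the "major.minor.patch" prefix of the release string.
 */
static bool ppc64_regs_may_mismatch(const char *osrelease)
{
	char *p;
	long major = strtol(osrelease, &p, 10), minor = 0, patch = 0;

	if (*p == '.') {
		minor = strtol(p + 1, &p, 10);
		if (*p == '.')
			patch = strtol(p + 1, NULL, 10);
	}

	/* Fixed in mainline since v6.5. */
	if (major > 6 || (major == 6 && minor >= 5))
		return false;

	/* Fixed on a stable branch from the listed patch level onwards. */
	for (size_t i = 0; i < sizeof(fixed_stable) / sizeof(fixed_stable[0]); i++) {
		const struct fixed_release *f = &fixed_stable[i];

		if (major == f->major && minor == f->minor && patch >= f->patch)
			return false;
	}

	return true;
}

/* Hypothetical helper: print a warning for possibly affected kernels. */
static bool warn_if_regs_may_mismatch(const char *osrelease)
{
	if (!ppc64_regs_may_mismatch(osrelease))
		return false;

	fprintf(stderr,
		"WARNING: kernel %s may have saved mismatched r1/LR registers; "
		"gdb 'bt' output may be incomplete\n", osrelease);
	return true;
}
```

One caveat with this approach: a distribution kernel could carry the fix as a backport without changing its base version, so the check may report false positives. That is why the warning in the sketch says "may" rather than stating that the vmcore is affected.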