[Crash-utility] Re: [PATCH v4 0/5] Improve stack unwind on ppc64

Hello Lianbo,

On Mon, Dec 18, 2023 at 09:16:30PM +0800, Lianbo Jiang wrote:
> On 12/15/23 21:26, Aditya Gupta wrote:
> 
> > Hello Lianbo and Tao,
> > 
> > On Fri, Dec 15, 2023 at 01:29:48PM +0530, Aditya Gupta wrote:
> > > ... <snip> ...
> > > 
> > > Known Issues:
> > > =============
> > > 
> > > 1. In gdb mode, 'bt' might fail to show a backtrace in a few vmcores collected
> > >     from older kernels. This is a known issue due to a register mismatch, and
> > >     its fix has been merged upstream:
> > > 
> > >     This can also cause some 'invalid kernel virtual address' errors while gdb
> > >     unwinds the stack.
> > > 
> > > Commit: https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef785819e72db79
> > > 
> > Regarding the backtrace you posted earlier with the invalid address errors:
> > 
> >      crash> gdb bt
> >      #0  0xc000000000281298 in crash_setup_regs (gdb: invalid kernel virtual
> >      address: fffffffffffffffb  type: "gdb_readmem callback"
> >      gdb: invalid kernel virtual address: fffffffffffffff7  type: "gdb_readmem
> >      callback"
> >      gdb: invalid kernel virtual address: fffffffffffffff3  type: "gdb_readmem
> >      callback"
> >      gdb: invalid kernel virtual address: fffffffffffffffb  type: "gdb_readmem
> >      callback"
> >      gdb: invalid kernel virtual address: fffffffffffffff7  type: "gdb_readmem
> >      callback"
> >      gdb: invalid kernel virtual address: fffffffffffffff3  type: "gdb_readmem
> >      callback"
> >      oldregs=<optimized out>, newregs=0xc00000000c0f7908) at
> >      ./arch/powerpc/include/asm/kexec.h:69
> >      #1  __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:975
> >      #2  0xfffffffffffffffb in ?? ()
> >      Backtrace stopped: previous frame inner to this frame (corrupt stack?)
> > 
> > To detect that this is due to the register mismatch issue in the kernel, and
> > print a warning, I found this code snippet in the 'drgn' tool's source [1]:
> > 
> > 	// In most cases, nip (word 32) contains the program counter. But, the
> > 	// NT_PRSTATUS note in Linux kernel vmcores before Linux kernel commit
> > 	// b684c09f09e7 ("powerpc: update ppc_save_regs to save current r1 in
> > 	// pt_regs") (in v6.5) is odd, and the saved stack pointer (r1) is for
> > 	// the program counter in the link register (word 36). The fix was also
> > 	// backported to several stable branches. Unfortunately, there's no good
> > 	// way to detect it other than the kernel version.
> > 	bool r1_is_for_lr = linux_kernel_prstatus;
> > 	if (linux_kernel_prstatus) {
> > 		char *p = (char *)prog->vmcoreinfo.osrelease;
> > 		long major = strtol(p, &p, 10), minor = 0, patch = 0;
> > 		if (*p == '.') {
> > 			minor = strtol(p + 1, &p, 10);
> > 			if (*p == '.')
> > 				patch = strtol(p + 1, NULL, 10);
> > 		}
> > 		if (major > 6 || (major == 6 && minor >= 5))
> > 			r1_is_for_lr = false;
> > 		// Commit cc46085350ccae5f3a2a55a48ab93ebf328d5e24 in v6.4.4.
> > 		if (major == 6 && minor == 4 && patch >= 4)
> > 			r1_is_for_lr = false;
> > 		// Commit ca9465056e1a40ec0b729c115871b1b17755b631 in v6.3.13.
> > 		if (major == 6 && minor == 3 && patch >= 13)
> > 			r1_is_for_lr = false;
> > 		// Commit 865d128cab0ded06c41b06cfdc191ef3d121a95f in v6.1.39.
> > 		if (major == 6 && minor == 1 && patch >= 39)
> > 			r1_is_for_lr = false;
> > 		// Commit 3786416e1fa2ec491b25a0bae6deec163a8795d1 in v5.15.121.
> > 		if (major == 5 && minor == 15 && patch >= 121)
> > 			r1_is_for_lr = false;
> > 	}
> > 
> > They use the kernel version to decide whether the issue is present. I don't
> > know how reliable that is, but should I implement it this way and print a
> 
> In crash tool, sometimes we use the following code to check if the issue
> exists, for example:
> 
> if (THIS_KERNEL_VERSION >= LINUX(6,5,0))
> 
>     xxx
> 
> However, we do not recommend it in crash tool unless there is no better way.
> 

Yes, I tried to think of alternative ways, but there doesn't seem to be a
concrete way to tell that the issue exists.

Maybe first I will improve the patch series per your reviews and then see how
to handle this.
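
For reference, here is a minimal standalone sketch of what such a version-based
check could look like, adapted from the drgn snippet quoted above. The function
name ppc64_r1_is_for_lr() is hypothetical and is not an existing crash or drgn
API; the version cut-offs are the ones listed in the drgn comments. It only
looks at the release string, so it assumes the version reflects whether the fix
is present. A distribution kernel that backports the fix under an older version
number would still be flagged.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of crash or drgn): returns true when a
 * vmcore from the given kernel release is expected to have the r1/LR
 * mismatch, i.e. the kernel predates commit b684c09f09e7 and the stable
 * backports named in the drgn comments.
 */
static bool ppc64_r1_is_for_lr(const char *osrelease)
{
	char *p;
	long major = strtol(osrelease, &p, 10), minor = 0, patch = 0;

	/* Parse "major.minor.patch"; anything after the patch is ignored. */
	if (*p == '.') {
		minor = strtol(p + 1, &p, 10);
		if (*p == '.')
			patch = strtol(p + 1, NULL, 10);
	}

	/* Fixed in mainline since v6.5. */
	if (major > 6 || (major == 6 && minor >= 5))
		return false;
	/* Stable backports: v6.4.4, v6.3.13, v6.1.39 and v5.15.121. */
	if (major == 6 && minor == 4 && patch >= 4)
		return false;
	if (major == 6 && minor == 3 && patch >= 13)
		return false;
	if (major == 6 && minor == 1 && patch >= 39)
		return false;
	if (major == 5 && minor == 15 && patch >= 121)
		return false;

	/* Everything else is assumed to be affected. */
	return true;
}
```

With this, a release string such as "6.4.3" would be treated as affected and
"6.5.0" as fixed, so the caller could print the warning before running
'gdb bt'.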

Thanks,
Aditya Gupta

> 
> Thanks.
> 
> Lianbo
> 
> > warning ?
> > 
> > [1] https://github.com/osandov/drgn/blob/a7f9db306764132db3c6344110a50510a7f8911b/libdrgn/arch_ppc64.c#L96
> > 
> > Thanks,
> > Aditya Gupta
> > 
> 



