[Crash-utility] Re: [PATCH v9 1/6] ppc64: correct gdb passthroughs by implementing machdep->get_cpu_reg

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Sorry Tao, I missed this mail earlier, as it got marked as read somehow.

On Tue, Feb 27, 2024 at 05:37:05PM +0800, Tao Liu wrote:
> On Thu, Feb 22, 2024 at 10:52:56AM +0530, Aditya Gupta wrote:
> >
> > <...snip...>
> >
> > +static int
> > +ppc64_get_cpu_reg(int cpu, int regno, const char *name, int size,
> > +		  void *value)
> > +{
> > +	struct bt_info bt_info, bt_setup;
> > +	struct task_context *tc;
> > +	ulong task;
> > +	struct ppc64_pt_regs *pt_regs;
> > +	ulong ip, sp;
> > +
> > +	if (LIVE()) {
> > +		/* doesn't support reading registers in live dump */
> > +		return FALSE;
> > +	}
> 
> What do you think if we remove the LIVE() check for v10? According to my test(with my x86_64 patchsets), with ppc64 arbitrary task stack unwinding support, we can view the stack unwinding for non-active tasks for live debug:
> 
> crash> sys
>       KERNEL: /usr/lib/debug/lib/modules/5.14.0-425.el9.ppc64le/vmlinux
>     DUMPFILE: /proc/kcore
>         CPUS: 8
>         DATE: Tue Feb 27 04:19:36 EST 2024
>       UPTIME: 01:11:56
> LOAD AVERAGE: 0.26, 0.09, 1.70
>        TASKS: 196
>     NODENAME: ibm-p9z-26-lp13.virt.pnr.lab.eng.rdu2.redhat.com
>      RELEASE: 5.14.0-425.el9.ppc64le
>      VERSION: #1 SMP Wed Feb 21 15:29:04 EST 2024
>      MACHINE: ppc64le  (2900 Mhz)
>       MEMORY: 8 GB
> crash> set 1
>     PID: 1
> COMMAND: "systemd"
>    TASK: c0000000035fc900  [THREAD_INFO: c0000000035fc900]
>     CPU: 1
>   STATE: TASK_INTERRUPTIBLE
> crash> bt
> PID: 1        TASK: c0000000035fc900  CPU: 1    COMMAND: "systemd"
>  #0 [c00000000369fa60] __schedule at c000000000fc3c58
>  #1 [c00000000369fb20] schedule at c000000000fc411c
>  #2 [c00000000369fb50] schedule_hrtimeout_range_clock at c000000000fcd2a4
>  #3 [c00000000369fc00] ep_poll at c00000000063640c
>  #4 [c00000000369fcf0] do_epoll_wait at c000000000636584
>  #5 [c00000000369fd40] sys_epoll_wait at c000000000636608
>  #6 [c00000000369fdb0] system_call_exception at c00000000002e994
>  #7 [c00000000369fe10] system_call_vectored_common at c00000000000bfe8
> crash> gdb bt
> #0  0xc000000000fc3c58 in context_switch (rf=0xc00000000369fac0, next=0x0, prev=0x0, rq=0x0) at kernel/sched/core.c:5409
> #1  __schedule (sched_mode=sched_mode@entry=0) at kernel/sched/core.c:6737
> #2  0xc000000000fc411c in schedule_loop (sched_mode=<optimized out>) at kernel/sched/core.c:6807
> #3  schedule () at kernel/sched/core.c:6816
> #4  0xc000000000fcd2a4 in schedule_hrtimeout_range_clock (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>, clock_id=<optimized out>) at kernel/time/hrtimer.c:2297
> #5  0xc00000000063640c in ep_poll (ep=0xc000000034b32e80, events=0x0, events@entry=0x1001d23a9e0, maxevents=maxevents@entry=82, timeout=timeout@entry=0x0) at fs/eventpoll.c:1904
> #6  0xc000000000636584 in do_epoll_wait (epfd=epfd@entry=4, events=events@entry=0x1001d23a9e0, maxevents=maxevents@entry=82, to=0x0) at fs/eventpoll.c:2288
> #7  0xc000000000636608 in __do_sys_epoll_wait (timeout=<optimized out>, maxevents=82, events=0x1001d23a9e0, epfd=4) at fs/eventpoll.c:2300
> #8  __se_sys_epoll_wait (epfd=4, events=1100000504288, maxevents=82, timeout=<optimized out>) at fs/eventpoll.c:2295
> #9  0xc00000000002e994 in system_call_exception (r3=4, r4=1100000504288, r5=82, r6=-1, r7=140735513318112, r8=140735513287592, r0=<optimized out>, regs=0xc00000000369fe80) at arch/powerpc/kernel/interrupt.c:221
> #10 0xc00000000000bfe8 in system_call_vectored_common () at arch/powerpc/kernel/interrupt_64.S:209
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
> 
> However for those active tasks in live mode, stack unwind will fail, but I think it doesn't matter because "bt" and "gdb bt" both cannot show stacks:

Seems good. Currently, I marked it as 'not supported in live debug'
since we only add 1 thread for CPU 0, and don't add any gdb thread for
other CPUs. So, setting gdb context fails (as there is no thread).

I see in your patches you are adding gdb thread, if the thread is not
there. After that it should work.

Can you remove the LIVE() check along with your patches itself ?
Even if I remove it now, this series still doesn't support this feature
in live debug, and will introduce a silent bug, just showing same
backtrace for all CPUs in live debug (THREAD 1, CPU 0).

Thanks,
Aditya Gupta

> 
> crash> ps
>       PID    PPID  CPU       TASK        ST  %MEM      VSZ      RSS  COMM
>         0       0   0  c000000002af6380  RU   0.0        0        0  [swapper/0]
> >       0       0   1  c0000000035f9000  RU   0.0        0        0  [swapper/1]
> >       0       0   2  c0000000035f0180  RU   0.0        0        0  [swapper/2]
> >       0       0   3  c0000000035c5580  RU   0.0        0        0  [swapper/3]
> >       0       0   4  c0000000035fac80  RU   0.0        0        0  [swapper/4]
> >       0       0   5  c0000000035c7200  RU   0.0        0        0  [swapper/5]
> >       0       0   6  c0000000035c0000  RU   0.0        0        0  [swapper/6]
> >       0       0   7  c0000000035dc800  RU   0.0        0        0  [swapper/7]
>         1       0   1  c0000000035fc900  IN   0.2   182272    14016  systemd
> crash> set c0000000035f0180
>     PID: 0
> COMMAND: "swapper/2"
>    TASK: c0000000035f0180  (1 of 8)  [THREAD_INFO: c0000000035f0180]
>     CPU: 2
>   STATE: TASK_RUNNING (ACTIVE)
> crash> bt
> PID: 0        TASK: c0000000035f0180  CPU: 2    COMMAND: "swapper/2"
> (active)
> crash> gdb bt
> #0  0xc000000003847d50 in ?? ()
> gdb: invalid kernel virtual address: fffffffffffffffc  type: "gdb_readmem callback"
> gdb: invalid kernel virtual address: fffffffffffffff8  type: "gdb_readmem callback"
> gdb: invalid kernel virtual address: fffffffffffffffc  type: "gdb_readmem callback"
> gdb: invalid kernel virtual address: fffffffffffffff8  type: "gdb_readmem callback"
> #1  0x0000000000000000 in ?? ()
> 
> What do you think?
> 
> Thanks,
> Tao Liu
> > +
> > +	/* Currently only handling registers available in ppc64_pt_regs:
> > +	 *
> > +	 * 0-31:   r0-r31
> > +	 * 64:     pc/nip
> > +	 * 65:     msr
> > +	 *
> > +	 * 67:     lr
> > +	 * 68:     ctr
> > +	 */
> > +	switch (regno) {
> > +	case PPC64_R0_REGNUM ... PPC64_R31_REGNUM:
> > +
> > +	case PPC64_PC_REGNUM:
> > +	case PPC64_MSR_REGNUM:
> > +	case PPC64_LR_REGNUM:
> > +	case PPC64_CTR_REGNUM:
> > +		break;
> > +
> > +	default:
> > +		// return false if we can't get that register
> > +		if (CRASHDEBUG(1))
> > +			error(WARNING, "unsupported register, regno=%d\n", regno);
> > +		return FALSE;
> > +	}
> > +
> > +	task = get_active_task(cpu);
> > +	tc = task_to_context(task);
> > +	BZERO(&bt_setup, sizeof(struct bt_info));
> > +	clone_bt_info(&bt_setup, &bt_info, tc);
> > +	fill_stackbuf(&bt_info);
> > +
> > +	// reusing the get_dumpfile_regs function to get pt regs structure
> > +	get_dumpfile_regs(&bt_info, &sp, &ip);
> > +	pt_regs = (struct ppc64_pt_regs *)bt_info.machdep;
> > +
> > +	if (!pt_regs) {
> > +		error(WARNING, "pt_regs not available for cpu %d\n", cpu);
> > +		return FALSE;
> > +	}
> > +
> > +	switch (regno) {
> > +	case PPC64_R0_REGNUM ... PPC64_R31_REGNUM:
> > +		if (size != sizeof(pt_regs->gpr[regno]))
> > +			return FALSE;  // size mismatch
> > +
> > +		memcpy(value, &pt_regs->gpr[regno], size);
> > +		break;
> > +
> > +	case PPC64_PC_REGNUM:
> > +		if (size != sizeof(pt_regs->nip))
> > +			return FALSE;  // size mismatch
> > +
> > +		memcpy(value, &pt_regs->nip, size);
> > +		break;
> > +
> > +	case PPC64_MSR_REGNUM:
> > +		if (size != sizeof(pt_regs->msr))
> > +			return FALSE;  // size mismatch
> > +
> > +		memcpy(value, &pt_regs->msr, size);
> > +		break;
> > +
> > +	case PPC64_LR_REGNUM:
> > +		if (size != sizeof(pt_regs->link))
> > +			return FALSE;  // size mismatch
> > +
> > +		memcpy(value, &pt_regs->link, size);
> > +		break;
> > +
> > +	case PPC64_CTR_REGNUM:
> > +		if (size != sizeof(pt_regs->ctr))
> > +			return FALSE;  // size mismatch
> > +
> > +		memcpy(value, &pt_regs->ctr, size);
> > +		break;
> > +	}
> > +
> > +	/* free buffer allocated by fill_stackbuf */
> > +	if (bt_info.stackbuf)
> > +		FREEBUF(bt_info.stackbuf);
> > +
> > +	return TRUE;
> > +}
> > +
> >  /*
> >   * For vmcore typically saved with KDump or FADump, get SP and IP values
> >   * from the saved ptregs.
> > @@ -2613,9 +2716,11 @@ ppc64_get_dumpfile_stack_frame(struct bt_info *bt_in, ulong *nip, ulong *ksp)
> >  		pt_regs = (struct ppc64_pt_regs *)bt->machdep;
> >  		ur_nip = pt_regs->nip;
> >  		ur_ksp = pt_regs->gpr[1];
> > -		/* Print the collected regs for panic task. */
> > -		ppc64_print_regs(pt_regs);
> > -		ppc64_print_nip_lr(pt_regs, 1);
> > +		if (!(bt->flags & BT_NO_PRINT_REGS)) {
> > +			/* Print the collected regs for panic task. */
> > +			ppc64_print_regs(pt_regs);
> > +			ppc64_print_nip_lr(pt_regs, 1);
> > +		}
> >  	} else if ((pc->flags & KDUMP) ||
> >  		   ((pc->flags & DISKDUMP) &&
> >  		    (*diskdump_flags & KDUMP_CMPRS_LOCAL))) {
> > -- 
> > 2.43.0
> > 
> 
--
Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx
https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
Contribution Guidelines: https://github.com/crash-utility/crash/wiki




[Index of Archives]     [Fedora Development]     [Fedora Desktop]     [Fedora SELinux]     [Yosemite News]     [KDE Users]     [Fedora Tools]

 

Powered by Linux