[Crash-utility] Re: [PATCH v9 1/6] ppc64: correct gdb passthroughs by implementing machdep->get_cpu_reg

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Tao,

On Tue, Mar 05, 2024 at 07:39:00PM +0800, Tao Liu wrote:
> Hi Aditya,
> 
> On Tue, Mar 5, 2024 at 6:30 PM Aditya Gupta <adityag@xxxxxxxxxxxxx> wrote:
> >
> > Sorry Tao, I missed this mail earlier, as it got marked as read somehow.
> >
> > On Tue, Feb 27, 2024 at 05:37:05PM +0800, Tao Liu wrote:
> > > On Thu, Feb 22, 2024 at 10:52:56AM +0530, Aditya Gupta wrote:
> > > >
> > > > <...snip...>
> > > >
> > > > +static int
> > > > +ppc64_get_cpu_reg(int cpu, int regno, const char *name, int size,
> > > > +             void *value)
> > > > +{
> > > > +   struct bt_info bt_info, bt_setup;
> > > > +   struct task_context *tc;
> > > > +   ulong task;
> > > > +   struct ppc64_pt_regs *pt_regs;
> > > > +   ulong ip, sp;
> > > > +
> > > > +   if (LIVE()) {
> > > > +           /* doesn't support reading registers in live dump */
> > > > +           return FALSE;
> > > > +   }
> > >
> > > What do you think if we remove the LIVE() check for v10? According to my test(with my x86_64 patchsets), with ppc64 arbitrary task stack unwinding support, we can view the stack unwinding for non-active tasks for live debug:
> > >
> > > crash> sys
> > >       KERNEL: /usr/lib/debug/lib/modules/5.14.0-425.el9.ppc64le/vmlinux
> > >     DUMPFILE: /proc/kcore
> > >         CPUS: 8
> > >         DATE: Tue Feb 27 04:19:36 EST 2024
> > >       UPTIME: 01:11:56
> > > LOAD AVERAGE: 0.26, 0.09, 1.70
> > >        TASKS: 196
> > >     NODENAME: ibm-p9z-26-lp13.virt.pnr.lab.eng.rdu2.redhat.com
> > >      RELEASE: 5.14.0-425.el9.ppc64le
> > >      VERSION: #1 SMP Wed Feb 21 15:29:04 EST 2024
> > >      MACHINE: ppc64le  (2900 Mhz)
> > >       MEMORY: 8 GB
> > > crash> set 1
> > >     PID: 1
> > > COMMAND: "systemd"
> > >    TASK: c0000000035fc900  [THREAD_INFO: c0000000035fc900]
> > >     CPU: 1
> > >   STATE: TASK_INTERRUPTIBLE
> > > crash> bt
> > > PID: 1        TASK: c0000000035fc900  CPU: 1    COMMAND: "systemd"
> > >  #0 [c00000000369fa60] __schedule at c000000000fc3c58
> > >  #1 [c00000000369fb20] schedule at c000000000fc411c
> > >  #2 [c00000000369fb50] schedule_hrtimeout_range_clock at c000000000fcd2a4
> > >  #3 [c00000000369fc00] ep_poll at c00000000063640c
> > >  #4 [c00000000369fcf0] do_epoll_wait at c000000000636584
> > >  #5 [c00000000369fd40] sys_epoll_wait at c000000000636608
> > >  #6 [c00000000369fdb0] system_call_exception at c00000000002e994
> > >  #7 [c00000000369fe10] system_call_vectored_common at c00000000000bfe8
> > > crash> gdb bt
> > > #0  0xc000000000fc3c58 in context_switch (rf=0xc00000000369fac0, next=0x0, prev=0x0, rq=0x0) at kernel/sched/core.c:5409
> > > #1  __schedule (sched_mode=sched_mode@entry=0) at kernel/sched/core.c:6737
> > > #2  0xc000000000fc411c in schedule_loop (sched_mode=<optimized out>) at kernel/sched/core.c:6807
> > > #3  schedule () at kernel/sched/core.c:6816
> > > #4  0xc000000000fcd2a4 in schedule_hrtimeout_range_clock (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>, clock_id=<optimized out>) at kernel/time/hrtimer.c:2297
> > > #5  0xc00000000063640c in ep_poll (ep=0xc000000034b32e80, events=0x0, events@entry=0x1001d23a9e0, maxevents=maxevents@entry=82, timeout=timeout@entry=0x0) at fs/eventpoll.c:1904
> > > #6  0xc000000000636584 in do_epoll_wait (epfd=epfd@entry=4, events=events@entry=0x1001d23a9e0, maxevents=maxevents@entry=82, to=0x0) at fs/eventpoll.c:2288
> > > #7  0xc000000000636608 in __do_sys_epoll_wait (timeout=<optimized out>, maxevents=82, events=0x1001d23a9e0, epfd=4) at fs/eventpoll.c:2300
> > > #8  __se_sys_epoll_wait (epfd=4, events=1100000504288, maxevents=82, timeout=<optimized out>) at fs/eventpoll.c:2295
> > > #9  0xc00000000002e994 in system_call_exception (r3=4, r4=1100000504288, r5=82, r6=-1, r7=140735513318112, r8=140735513287592, r0=<optimized out>, regs=0xc00000000369fe80) at arch/powerpc/kernel/interrupt.c:221
> > > #10 0xc00000000000bfe8 in system_call_vectored_common () at arch/powerpc/kernel/interrupt_64.S:209
> > > Backtrace stopped: previous frame inner to this frame (corrupt stack?)
> > >
> > > However for those active tasks in live mode, stack unwind will fail, but I think it doesn't matter because "bt" and "gdb bt" both cannot show stacks:
> >
> > Seems good. Currently, I marked it as 'not supported in live debug'
> > since we only add 1 thread for CPU 0, and don't add any gdb thread for
> > other CPUs. So, setting gdb context fails (as there is no thread).
> >
> > I see in your patches you are adding gdb thread, if the thread is not
> > there. After that it should work.
> >
> > Can you remove the LIVE() check along with your patches itself ?
> > Even if I remove it now, this series still doesn't support this feature
> > in live debug, and will introduce a silent bug, just showing same
> > backtrace for all CPUs in live debug (THREAD 1, CPU 0).
> 
> OK, seems reasonable. I will remove the LIVE() check in my patchset.

Sure, Thanks Tao !

- Aditya Gupta

> 
> Thanks,
> Tao Liu
> 
> >
> > Thanks,
> > Aditya Gupta
> >
> > >
> > > crash> ps
> > >       PID    PPID  CPU       TASK        ST  %MEM      VSZ      RSS  COMM
> > >         0       0   0  c000000002af6380  RU   0.0        0        0  [swapper/0]
> > > >       0       0   1  c0000000035f9000  RU   0.0        0        0  [swapper/1]
> > > >       0       0   2  c0000000035f0180  RU   0.0        0        0  [swapper/2]
> > > >       0       0   3  c0000000035c5580  RU   0.0        0        0  [swapper/3]
> > > >       0       0   4  c0000000035fac80  RU   0.0        0        0  [swapper/4]
> > > >       0       0   5  c0000000035c7200  RU   0.0        0        0  [swapper/5]
> > > >       0       0   6  c0000000035c0000  RU   0.0        0        0  [swapper/6]
> > > >       0       0   7  c0000000035dc800  RU   0.0        0        0  [swapper/7]
> > >         1       0   1  c0000000035fc900  IN   0.2   182272    14016  systemd
> > > crash> set c0000000035f0180
> > >     PID: 0
> > > COMMAND: "swapper/2"
> > >    TASK: c0000000035f0180  (1 of 8)  [THREAD_INFO: c0000000035f0180]
> > >     CPU: 2
> > >   STATE: TASK_RUNNING (ACTIVE)
> > > crash> bt
> > > PID: 0        TASK: c0000000035f0180  CPU: 2    COMMAND: "swapper/2"
> > > (active)
> > > crash> gdb bt
> > > #0  0xc000000003847d50 in ?? ()
> > > gdb: invalid kernel virtual address: fffffffffffffffc  type: "gdb_readmem callback"
> > > gdb: invalid kernel virtual address: fffffffffffffff8  type: "gdb_readmem callback"
> > > gdb: invalid kernel virtual address: fffffffffffffffc  type: "gdb_readmem callback"
> > > gdb: invalid kernel virtual address: fffffffffffffff8  type: "gdb_readmem callback"
> > > #1  0x0000000000000000 in ?? ()
> > >
> > > What do you think?
> > >
> > > Thanks,
> > > Tao Liu
> > > > +
> > > > +   /* Currently only handling registers available in ppc64_pt_regs:
> > > > +    *
> > > > +    * 0-31:   r0-r31
> > > > +    * 64:     pc/nip
> > > > +    * 65:     msr
> > > > +    *
> > > > +    * 67:     lr
> > > > +    * 68:     ctr
> > > > +    */
> > > > +   switch (regno) {
> > > > +   case PPC64_R0_REGNUM ... PPC64_R31_REGNUM:
> > > > +
> > > > +   case PPC64_PC_REGNUM:
> > > > +   case PPC64_MSR_REGNUM:
> > > > +   case PPC64_LR_REGNUM:
> > > > +   case PPC64_CTR_REGNUM:
> > > > +           break;
> > > > +
> > > > +   default:
> > > > +           // return false if we can't get that register
> > > > +           if (CRASHDEBUG(1))
> > > > +                   error(WARNING, "unsupported register, regno=%d\n", regno);
> > > > +           return FALSE;
> > > > +   }
> > > > +
> > > > +   task = get_active_task(cpu);
> > > > +   tc = task_to_context(task);
> > > > +   BZERO(&bt_setup, sizeof(struct bt_info));
> > > > +   clone_bt_info(&bt_setup, &bt_info, tc);
> > > > +   fill_stackbuf(&bt_info);
> > > > +
> > > > +   // reusing the get_dumpfile_regs function to get pt regs structure
> > > > +   get_dumpfile_regs(&bt_info, &sp, &ip);
> > > > +   pt_regs = (struct ppc64_pt_regs *)bt_info.machdep;
> > > > +
> > > > +   if (!pt_regs) {
> > > > +           error(WARNING, "pt_regs not available for cpu %d\n", cpu);
> > > > +           return FALSE;
> > > > +   }
> > > > +
> > > > +   switch (regno) {
> > > > +   case PPC64_R0_REGNUM ... PPC64_R31_REGNUM:
> > > > +           if (size != sizeof(pt_regs->gpr[regno]))
> > > > +                   return FALSE;  // size mismatch
> > > > +
> > > > +           memcpy(value, &pt_regs->gpr[regno], size);
> > > > +           break;
> > > > +
> > > > +   case PPC64_PC_REGNUM:
> > > > +           if (size != sizeof(pt_regs->nip))
> > > > +                   return FALSE;  // size mismatch
> > > > +
> > > > +           memcpy(value, &pt_regs->nip, size);
> > > > +           break;
> > > > +
> > > > +   case PPC64_MSR_REGNUM:
> > > > +           if (size != sizeof(pt_regs->msr))
> > > > +                   return FALSE;  // size mismatch
> > > > +
> > > > +           memcpy(value, &pt_regs->msr, size);
> > > > +           break;
> > > > +
> > > > +   case PPC64_LR_REGNUM:
> > > > +           if (size != sizeof(pt_regs->link))
> > > > +                   return FALSE;  // size mismatch
> > > > +
> > > > +           memcpy(value, &pt_regs->link, size);
> > > > +           break;
> > > > +
> > > > +   case PPC64_CTR_REGNUM:
> > > > +           if (size != sizeof(pt_regs->ctr))
> > > > +                   return FALSE;  // size mismatch
> > > > +
> > > > +           memcpy(value, &pt_regs->ctr, size);
> > > > +           break;
> > > > +   }
> > > > +
> > > > +   /* free buffer allocated by fill_stackbuf */
> > > > +   if (bt_info.stackbuf)
> > > > +           FREEBUF(bt_info.stackbuf);
> > > > +
> > > > +   return TRUE;
> > > > +}
> > > > +
> > > >  /*
> > > >   * For vmcore typically saved with KDump or FADump, get SP and IP values
> > > >   * from the saved ptregs.
> > > > @@ -2613,9 +2716,11 @@ ppc64_get_dumpfile_stack_frame(struct bt_info *bt_in, ulong *nip, ulong *ksp)
> > > >             pt_regs = (struct ppc64_pt_regs *)bt->machdep;
> > > >             ur_nip = pt_regs->nip;
> > > >             ur_ksp = pt_regs->gpr[1];
> > > > -           /* Print the collected regs for panic task. */
> > > > -           ppc64_print_regs(pt_regs);
> > > > -           ppc64_print_nip_lr(pt_regs, 1);
> > > > +           if (!(bt->flags & BT_NO_PRINT_REGS)) {
> > > > +                   /* Print the collected regs for panic task. */
> > > > +                   ppc64_print_regs(pt_regs);
> > > > +                   ppc64_print_nip_lr(pt_regs, 1);
> > > > +           }
> > > >     } else if ((pc->flags & KDUMP) ||
> > > >                ((pc->flags & DISKDUMP) &&
> > > >                 (*diskdump_flags & KDUMP_CMPRS_LOCAL))) {
> > > > --
> > > > 2.43.0
> > > >
> > >
> >
> 
--
Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx
https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
Contribution Guidelines: https://github.com/crash-utility/crash/wiki




[Index of Archives]     [Fedora Development]     [Fedora Desktop]     [Fedora SELinux]     [Yosemite News]     [KDE Users]     [Fedora Tools]

 

Powered by Linux