Hi Aditya, On Tue, Mar 5, 2024 at 6:30 PM Aditya Gupta <adityag@xxxxxxxxxxxxx> wrote: > > Sorry Tao, I missed this mail earlier, as it got marked as read somehow. > > On Tue, Feb 27, 2024 at 05:37:05PM +0800, Tao Liu wrote: > > On Thu, Feb 22, 2024 at 10:52:56AM +0530, Aditya Gupta wrote: > > > > > > <...snip...> > > > > > > +static int > > > +ppc64_get_cpu_reg(int cpu, int regno, const char *name, int size, > > > + void *value) > > > +{ > > > + struct bt_info bt_info, bt_setup; > > > + struct task_context *tc; > > > + ulong task; > > > + struct ppc64_pt_regs *pt_regs; > > > + ulong ip, sp; > > > + > > > + if (LIVE()) { > > > + /* doesn't support reading registers in live dump */ > > > + return FALSE; > > > + } > > > > What do you think if we remove the LIVE() check for v10? According to my test(with my x86_64 patchsets), with ppc64 arbitrary task stack unwinding support, we can view the stack unwinding for non-active tasks for live debug: > > > > crash> sys > > KERNEL: /usr/lib/debug/lib/modules/5.14.0-425.el9.ppc64le/vmlinux > > DUMPFILE: /proc/kcore > > CPUS: 8 > > DATE: Tue Feb 27 04:19:36 EST 2024 > > UPTIME: 01:11:56 > > LOAD AVERAGE: 0.26, 0.09, 1.70 > > TASKS: 196 > > NODENAME: ibm-p9z-26-lp13.virt.pnr.lab.eng.rdu2.redhat.com > > RELEASE: 5.14.0-425.el9.ppc64le > > VERSION: #1 SMP Wed Feb 21 15:29:04 EST 2024 > > MACHINE: ppc64le (2900 Mhz) > > MEMORY: 8 GB > > crash> set 1 > > PID: 1 > > COMMAND: "systemd" > > TASK: c0000000035fc900 [THREAD_INFO: c0000000035fc900] > > CPU: 1 > > STATE: TASK_INTERRUPTIBLE > > crash> bt > > PID: 1 TASK: c0000000035fc900 CPU: 1 COMMAND: "systemd" > > #0 [c00000000369fa60] __schedule at c000000000fc3c58 > > #1 [c00000000369fb20] schedule at c000000000fc411c > > #2 [c00000000369fb50] schedule_hrtimeout_range_clock at c000000000fcd2a4 > > #3 [c00000000369fc00] ep_poll at c00000000063640c > > #4 [c00000000369fcf0] do_epoll_wait at c000000000636584 > > #5 [c00000000369fd40] sys_epoll_wait at c000000000636608 > > #6 [c00000000369fdb0] system_call_exception at c00000000002e994 > > #7 [c00000000369fe10] system_call_vectored_common at c00000000000bfe8 > > crash> gdb bt > > #0 0xc000000000fc3c58 in context_switch (rf=0xc00000000369fac0, next=0x0, prev=0x0, rq=0x0) at kernel/sched/core.c:5409 > > #1 __schedule (sched_mode=sched_mode@entry=0) at kernel/sched/core.c:6737 > > #2 0xc000000000fc411c in schedule_loop (sched_mode=<optimized out>) at kernel/sched/core.c:6807 > > #3 schedule () at kernel/sched/core.c:6816 > > #4 0xc000000000fcd2a4 in schedule_hrtimeout_range_clock (expires=<optimized out>, delta=<optimized out>, mode=<optimized out>, clock_id=<optimized out>) at kernel/time/hrtimer.c:2297 > > #5 0xc00000000063640c in ep_poll (ep=0xc000000034b32e80, events=0x0, events@entry=0x1001d23a9e0, maxevents=maxevents@entry=82, timeout=timeout@entry=0x0) at fs/eventpoll.c:1904 > > #6 0xc000000000636584 in do_epoll_wait (epfd=epfd@entry=4, events=events@entry=0x1001d23a9e0, maxevents=maxevents@entry=82, to=0x0) at fs/eventpoll.c:2288 > > #7 0xc000000000636608 in __do_sys_epoll_wait (timeout=<optimized out>, maxevents=82, events=0x1001d23a9e0, epfd=4) at fs/eventpoll.c:2300 > > #8 __se_sys_epoll_wait (epfd=4, events=1100000504288, maxevents=82, timeout=<optimized out>) at fs/eventpoll.c:2295 > > #9 0xc00000000002e994 in system_call_exception (r3=4, r4=1100000504288, r5=82, r6=-1, r7=140735513318112, r8=140735513287592, r0=<optimized out>, regs=0xc00000000369fe80) at arch/powerpc/kernel/interrupt.c:221 > > #10 0xc00000000000bfe8 in system_call_vectored_common () at arch/powerpc/kernel/interrupt_64.S:209 > > Backtrace stopped: previous frame inner to this frame (corrupt stack?) > > > > However for those active tasks in live mode, stack unwind will fail, but I think it doesn't matter because "bt" and "gdb bt" both cannot show stacks: > > Seems good. Currently, I marked it as 'not supported in live debug' > since we only add 1 thread for CPU 0, and don't add any gdb thread for > other CPUs. So, setting gdb context fails (as there is no thread). > > I see in your patches you are adding gdb thread, if the thread is not > there. After that it should work. > > Can you remove the LIVE() check along with your patches itself ? > Even if I remove it now, this series still doesn't support this feature > in live debug, and will introduce a silent bug, just showing same > backtrace for all CPUs in live debug (THREAD 1, CPU 0). OK, seems reasonable. I will remove the LIVE() check in my patchset. Thanks, Tao Liu > > Thanks, > Aditya Gupta > > > > > crash> ps > > PID PPID CPU TASK ST %MEM VSZ RSS COMM > > 0 0 0 c000000002af6380 RU 0.0 0 0 [swapper/0] > > > 0 0 1 c0000000035f9000 RU 0.0 0 0 [swapper/1] > > > 0 0 2 c0000000035f0180 RU 0.0 0 0 [swapper/2] > > > 0 0 3 c0000000035c5580 RU 0.0 0 0 [swapper/3] > > > 0 0 4 c0000000035fac80 RU 0.0 0 0 [swapper/4] > > > 0 0 5 c0000000035c7200 RU 0.0 0 0 [swapper/5] > > > 0 0 6 c0000000035c0000 RU 0.0 0 0 [swapper/6] > > > 0 0 7 c0000000035dc800 RU 0.0 0 0 [swapper/7] > > 1 0 1 c0000000035fc900 IN 0.2 182272 14016 systemd > > crash> set c0000000035f0180 > > PID: 0 > > COMMAND: "swapper/2" > > TASK: c0000000035f0180 (1 of 8) [THREAD_INFO: c0000000035f0180] > > CPU: 2 > > STATE: TASK_RUNNING (ACTIVE) > > crash> bt > > PID: 0 TASK: c0000000035f0180 CPU: 2 COMMAND: "swapper/2" > > (active) > > crash> gdb bt > > #0 0xc000000003847d50 in ?? () > > gdb: invalid kernel virtual address: fffffffffffffffc type: "gdb_readmem callback" > > gdb: invalid kernel virtual address: fffffffffffffff8 type: "gdb_readmem callback" > > gdb: invalid kernel virtual address: fffffffffffffffc type: "gdb_readmem callback" > > gdb: invalid kernel virtual address: fffffffffffffff8 type: "gdb_readmem callback" > > #1 0x0000000000000000 in ?? () > > > > What do you think? > > > > Thanks, > > Tao Liu > > > + > > > + /* Currently only handling registers available in ppc64_pt_regs: > > > + * > > > + * 0-31: r0-r31 > > > + * 64: pc/nip > > > + * 65: msr > > > + * > > > + * 67: lr > > > + * 68: ctr > > > + */ > > > + switch (regno) { > > > + case PPC64_R0_REGNUM ... PPC64_R31_REGNUM: > > > + > > > + case PPC64_PC_REGNUM: > > > + case PPC64_MSR_REGNUM: > > > + case PPC64_LR_REGNUM: > > > + case PPC64_CTR_REGNUM: > > > + break; > > > + > > > + default: > > > + // return false if we can't get that register > > > + if (CRASHDEBUG(1)) > > > + error(WARNING, "unsupported register, regno=%d\n", regno); > > > + return FALSE; > > > + } > > > + > > > + task = get_active_task(cpu); > > > + tc = task_to_context(task); > > > + BZERO(&bt_setup, sizeof(struct bt_info)); > > > + clone_bt_info(&bt_setup, &bt_info, tc); > > > + fill_stackbuf(&bt_info); > > > + > > > + // reusing the get_dumpfile_regs function to get pt regs structure > > > + get_dumpfile_regs(&bt_info, &sp, &ip); > > > + pt_regs = (struct ppc64_pt_regs *)bt_info.machdep; > > > + > > > + if (!pt_regs) { > > > + error(WARNING, "pt_regs not available for cpu %d\n", cpu); > > > + return FALSE; > > > + } > > > + > > > + switch (regno) { > > > + case PPC64_R0_REGNUM ... PPC64_R31_REGNUM: > > > + if (size != sizeof(pt_regs->gpr[regno])) > > > + return FALSE; // size mismatch > > > + > > > + memcpy(value, &pt_regs->gpr[regno], size); > > > + break; > > > + > > > + case PPC64_PC_REGNUM: > > > + if (size != sizeof(pt_regs->nip)) > > > + return FALSE; // size mismatch > > > + > > > + memcpy(value, &pt_regs->nip, size); > > > + break; > > > + > > > + case PPC64_MSR_REGNUM: > > > + if (size != sizeof(pt_regs->msr)) > > > + return FALSE; // size mismatch > > > + > > > + memcpy(value, &pt_regs->msr, size); > > > + break; > > > + > > > + case PPC64_LR_REGNUM: > > > + if (size != sizeof(pt_regs->link)) > > > + return FALSE; // size mismatch > > > + > > > + memcpy(value, &pt_regs->link, size); > > > + break; > > > + > > > + case PPC64_CTR_REGNUM: > > > + if (size != sizeof(pt_regs->ctr)) > > > + return FALSE; // size mismatch > > > + > > > + memcpy(value, &pt_regs->ctr, size); > > > + break; > > > + } > > > + > > > + /* free buffer allocated by fill_stackbuf */ > > > + if (bt_info.stackbuf) > > > + FREEBUF(bt_info.stackbuf); > > > + > > > + return TRUE; > > > +} > > > + > > > /* > > > * For vmcore typically saved with KDump or FADump, get SP and IP values > > > * from the saved ptregs. > > > @@ -2613,9 +2716,11 @@ ppc64_get_dumpfile_stack_frame(struct bt_info *bt_in, ulong *nip, ulong *ksp) > > > pt_regs = (struct ppc64_pt_regs *)bt->machdep; > > > ur_nip = pt_regs->nip; > > > ur_ksp = pt_regs->gpr[1]; > > > - /* Print the collected regs for panic task. */ > > > - ppc64_print_regs(pt_regs); > > > - ppc64_print_nip_lr(pt_regs, 1); > > > + if (!(bt->flags & BT_NO_PRINT_REGS)) { > > > + /* Print the collected regs for panic task. */ > > > + ppc64_print_regs(pt_regs); > > > + ppc64_print_nip_lr(pt_regs, 1); > > > + } > > > } else if ((pc->flags & KDUMP) || > > > ((pc->flags & DISKDUMP) && > > > (*diskdump_flags & KDUMP_CMPRS_LOCAL))) { > > > -- > > > 2.43.0 > > > > > > -- Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/ Contribution Guidelines: https://github.com/crash-utility/crash/wiki