[Crash-utility] Re: [PATCH v5 0/5] Improve stack unwind on ppc64

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello Tao,

On Mon, Dec 25, 2023 at 10:25:00AM +0800, Tao Liu wrote:
> Hi Aditya,
> 
> On Wed, Dec 20, 2023 at 4:52 PM Aditya Gupta <adityag@xxxxxxxxxxxxx> wrote:
> >
> > The Problem:
> > ============
> >
> > Currently crash is unable to show function arguments and local variables, as
> > gdb can do. And functionality for moving between frames ('up'/'down') is not
> > working in crash.
> >
> > Crash has 'gdb passthroughs' for things gdb can do, but the gdb passthroughs
> > 'bt', 'frame', 'info locals', 'up', 'down' are not working either, due to
> > gdb not getting the register values from `crash_target::fetch_registers`,
> > which then uses `machdep->get_cpu_reg`, which is not implemented for PPC64
> >
> > Proposed Solution:
> > ==================
> >
> > Fix the gdb passthroughs by implementing "machdep->get_cpu_reg" for PPC64.
> > This way, "gdb mode in crash" will support this feature for both ELF and
> > kdump-compressed vmcore formats, while "gdb" would only have supported ELF
> > format
> >
> > This way other features of 'gdb', such as seeing
> > backtraces/registers/variables/arguments/local variables, moving up and
> > down stack frames, can be used with any ppc64 vmcore, irrespective of
> > being ELF format or kdump-compressed format.
> >
> > Note: This doesn't support live debugging on ppc64, since registers are not
> > available to be read
> >
> > Implications on Architectures:
> > ====================================
> >
> > No architecture other than PPC64 has been affected, other than in case of
> > 'frame' command
> >
> > As mentioned in patch #2, since frame will not be prohibited, so it will print:
> >
> >         crash> frame
> >         #0  <unavailable> in ?? ()
> >
> > Instead of before prohibited message:
> >
> >         crash> frame
> >         crash: prohibited gdb command: frame
> >
> > Major change will be in 'gdb mode' on PPC64, that it will print the frames, and
> > local variables, instead of failing with errors showing no frame, or showing
> > that couldn't get PC, it will be able to give all this information.
> >
> > Testing:
> > ========
> >
> > Git tree with this patch series applied:
> > https://github.com/adi-g15-ibm/crash/tree/stack-unwind-v5-smaller-gran
> >
> > To test various gdb passthroughs:
> >
> >         (crash) set
> >         (crash) set gdb on
> >         gdb> thread
> >         gdb> bt
> >         gdb> info threads
> >         gdb> info threads
> >         gdb> info locals
> >         gdb> info variables irq_rover_lock
> >         gdb> info args
> >         gdb> thread 2
> >         gdb> set gdb off
> >         (crash) set
> >         (crash) set -c 6
> >         (crash) gdb thread
> >         (crash) bt
> >         (crash) gdb bt
> >         (crash) frame
> >         (crash) up
> >         (crash) down
> >         (crash) info locals
> >
> 
> It seems we cannot view arbitrary process stack unwinding right? I
> mean, "gdb thread X" or "gdb bt" can only view the process which is
> currently running on the specific CPU, but for other processes which
> are not occupying CPU, we cannot specify it by "gdb bt <pid>" or "gdb
> thread <pid>". However drgn can unwind stack on arbitrary processes:
> >>> prog.stack_trace(<pid>)[<frame_id>]["varibale_name"]

Yes, currently gdb features cannot be used with arbitrary PIDs, since GDB
doesn't know of all those threads.

GDB gets the details of threads, registers etc from a registered 'target'.
When we debug a vmcore and vmlinux with GDB (NOT gdb mode in crash), ie. with
`gdb <vmlinux-path> <vmcore-path>`, then the target is 'core_target'.
This target knows all the known PIDs in the vmcore.

While in gdb mode, the target we register with GDB is 'crash_target'. And in
'crash_target_init', we explicitly register only 'nr_cpus' threads, one for each
CPU. Hence it does not know of any other threads.

> 
> It seems crash cmd bt need to be improved in order to view arbitrary
> process stack unwinding like drgn right?

Yes, specifically 'crash_target' needs to be improved to support all those
arbitrary threads. crash bt can already print arbitrary PID backtraces, but
improving crash_target will allow printing arbitrary threads in gdb mode also.

Thanks,
Aditya Gupta

> 
> Thanks,
> Tao Liu
> 
> 
> > Known Issues:
> > =============
> >
> > 1. In gdb mode, 'bt' might fail to show backtrace in few vmcores collected
> >    from older kernels. This is a known issue due to register mismatch, and
> >    its fix has been merged upstream:
> >
> >    This can also cause some 'invalid kernel virtual address' errors during gdb
> >    unwinding the stack registers
> >
> > Commit: https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef785819e72db79
> >
> > Fixing GDB passthroughs on other architectures
> > ==============================================
> >
> > Much of the work for making gdb passthroughs like 'gdb bt', 'gdb
> > thread', 'gdb info locals' etc. has been done by the patches introducing
> > 'machdep->get_cpu_reg' and this series fixing some issues in that.
> >
> > Other architectures should be able to fix these gdb functionalities by
> > simply implementing 'machdep->get_cpu_reg (cpu, regno, ...)'.
> >
> > The reasoning behind that has been explained with a diagram in commit
> > description of patch #1
> >
> > I will assist with my findings/observations fixing it on ppc64 whenever needed.
> >
> > Changelog:
> > ==========
> >
> > V5:
> > + changes in patch #1: made ppc64_get_cpu_reg static, and remove unreachable
> > code
> > + changes in patch #3: fixed typo 'ppc64_renum' instead of 'ppc64_regnum',
> > remove unneeded if condition
> > + changes in patch #5: implement refresh regcache on per thread, instead of all
> > threads at once
> >
> > V4:
> > + fix segmentation fault in live debugging (change in patch #1)
> > + mention live debugging not supported in cover letter and patch #1
> > + fixed some checkpatch warnings (change in patch #5)
> >
> > V3:
> > + default gdb thread will be the crashing thread, instead of being
> >   thread '0'
> > + synchronise crash cpu and gdb thread context
> > + fix bug in gdb_interface, that replaced gdb's output stream, losing
> >   output in some cases, such as info threads and extra output in info
> >   variables
> > + fix 'info threads'
> >
> > RFC V2:
> >   - removed patch implementing 'frame', 'up', 'down' in crash
> >   - updated the cover letter by removing the mention of those commands other
> >         than the respective gdb passthrough
> >
> > Aditya Gupta (5):
> >   ppc64: correct gdb passthroughs by implementing machdep->get_cpu_reg
> >   remove 'frame' from prohibited commands list
> >   synchronise cpu context changes between crash/gdb
> >   fix gdb_interface: restore gdb's output streams at end of
> >     gdb_interface
> >   fix 'info threads' command
> >
> >  crash_target.c  |  44 ++++++++++++++++
> >  defs.h          | 130 +++++++++++++++++++++++++++++++++++++++++++++++-
> >  gdb-10.2.patch  | 110 +++++++++++++++++++++++++++++++++++++++-
> >  gdb_interface.c |   2 +-
> >  kernel.c        |  47 +++++++++++++++--
> >  ppc64.c         |  95 +++++++++++++++++++++++++++++++++--
> >  task.c          |  14 ++++++
> >  tools.c         |   2 +-
> >  8 files changed, 434 insertions(+), 10 deletions(-)
> >
> > --
> > 2.41.0
> >
> 
--
Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx
https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
Contribution Guidelines: https://github.com/crash-utility/crash/wiki




[Index of Archives]     [Fedora Development]     [Fedora Desktop]     [Fedora SELinux]     [Yosemite News]     [KDE Users]     [Fedora Tools]

 

Powered by Linux