[Crash-utility] Re: [PATCH v9 0/5] Improve stack unwind on ppc64

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Lianbo,

On Thu, Feb 22, 2024 at 05:16:31PM +0800, lijiang wrote:
> Hi, Aditya and Tao
> 
> Thank you for working on this.
> 
> I would suggest that the current feature can be splitted into several
> steps, for example:
> [1] complete a basic functionality on ppc64
> [2] improve it based on [1], such known issues
> [3] after finishing the [1](and [2]), Tao will implement it on X86 64
> 
> That can avoid confusion(or dependency) between ppc64 and X86 64 patches,
> it may save some time and make it easier to review patches.

Sure, actually I made a mistake in v9, other than that there should be
no dependency in this patch series.
This is the summary:

Patch #1: PPC64 specific
Patch #2 - #5: Architecture Independent, with patch #4 fixing an
existing bug in gdb_interface, this can be an independent patch.

So based on your suggestions, let's split into it as:

1. This patch series having Patch #1,2,3,5 (removed 4)
2. Fix for existing bug in gdb_interface: Patch #4, It will be good if
this can be merged soon, as testing 'info threads' will not work due to
the bug
3. Tao's patches for x86_64

Also, even though i have kept #2-#5 as architecture-independent, but
then they should be merged after the ppc patches, for their description
and usage to make sense, so I still kept them in the series.

Do let me know, I will send a v10 accordingly :)

Thanks,
Aditya Gupta

> 
> Thanks.
> Lianbo
> 
> On Thu, Feb 22, 2024 at 4:58 PM Tao Liu <ltao@xxxxxxxxxx> wrote:
> 
> > Hi Aditya,
> >
> > On Thu, Feb 22, 2024 at 1:23 PM Aditya Gupta <adityag@xxxxxxxxxxxxx>
> > wrote:
> > >
> > > The Problem:
> > > ============
> > >
> > > Currently crash is unable to show function arguments and local
> > variables, as
> > > gdb can do. And functionality for moving between frames ('up'/'down') is
> > not
> > > working in crash.
> > >
> > > Crash has 'gdb passthroughs' for things gdb can do, but the gdb
> > passthroughs
> > > 'bt', 'frame', 'info locals', 'up', 'down' are not working either, due to
> > > gdb not getting the register values from `crash_target::fetch_registers`,
> > > which then uses `machdep->get_cpu_reg`, which is not implemented for
> > PPC64
> > >
> > > Proposed Solution:
> > > ==================
> > >
> > > Fix the gdb passthroughs by implementing "machdep->get_cpu_reg" for
> > PPC64.
> > > This way, "gdb mode in crash" will support this feature for both ELF and
> > > kdump-compressed vmcore formats, while "gdb" would only have supported
> > ELF
> > > format
> > >
> > > This way other features of 'gdb', such as seeing
> > > backtraces/registers/variables/arguments/local variables, moving up and
> > > down stack frames, can be used with any ppc64 vmcore, irrespective of
> > > being ELF format or kdump-compressed format.
> > >
> > > Note: This doesn't support live debugging on ppc64, since registers are
> > not
> > > available to be read
> > >
> > > Implications on Architectures:
> > > ====================================
> > >
> > > No architecture other than PPC64 has been affected, other than in case of
> > > 'frame' command
> > >
> > > As mentioned in patch #2, since frame will not be prohibited, so it will
> > print:
> > >
> > >         crash> frame
> > >         #0  <unavailable> in ?? ()
> > >
> > > Instead of before prohibited message:
> > >
> > >         crash> frame
> > >         crash: prohibited gdb command: frame
> > >
> > > Major change will be in 'gdb mode' on PPC64, that it will print the
> > frames, and
> > > local variables, instead of failing with errors showing no frame, or
> > showing
> > > that couldn't get PC, it will be able to give all this information.
> > >
> > > Testing:
> > > ========
> > >
> > > Git tree with this patch series applied:
> > > https://github.com/adi-g15-ibm/crash/tree/stack-unwind-v9
> >
> > I doubt the v9 patch will not work without my x86's trial patchset. I
> > see your repo, you directly applied my "ppc64 arbitrary task stack
> > unwind support" patch onto yours. However the patch has some
> > dependency on my x86 trial patchset.
> >
> > E.g. I added a new member "bool need_free" for defs.h:struct bt_info
> > in the "x86 unwind support" patch of mine, however you didn't pick the
> > one. And in ppc64.c:ppc64_get_stack_frame() and
> > ppc64.c:ppc64_get_cpu_reg(), the member will be used as:
> >
> > if (bt_info.need_free) {
> >      FREEBUF(pt_regs);
> >     bt_info.need_free = FALSE;
> > }
> >
> > So I guess (not tried yet) that your patchset v9 will not work.
> >
> > Currently I'm still struggling with some failing cases of x86_64
> > unwinding. So I didn't arrange my patchsets, along with the patch
> > commit log well, since they are all "trial" patches.
> >
> > I agree the patch "ppc64 arbitrary task stack unwind support" is
> > better to go with the ppc patch series. But I suggest we make some
> > modifications for it:
> >
> > 1) I'm OK with it being a stand alone patch, or merging the code
> > changes of this one into your previous patches, but I prefer the
> > latter one :)
> >
> > 2) If you'd like to go with a stand alone patch, could you please
> > rewrite a commit log and title for this one?
> >
> > Thanks,
> > Tao Liu
> >
> >
> > >
> > > To test various gdb passthroughs:
> > >
> > >         (crash) set
> > >         (crash) set gdb on
> > >         gdb> thread
> > >         gdb> bt
> > >         gdb> info threads
> > >         gdb> info threads
> > >         gdb> info locals
> > >         gdb> info variables irq_rover_lock
> > >         gdb> info args
> > >         gdb> thread 2
> > >         gdb> set gdb off
> > >         (crash) set
> > >         (crash) set -c 6
> > >         (crash) gdb thread
> > >         (crash) bt
> > >         (crash) gdb bt
> > >         (crash) frame
> > >         (crash) gdb up
> > >         (crash) gdb down
> > >         (crash) info locals
> > >
> > > Known Issues:
> > > =============
> > >
> > > 1. In gdb mode, 'bt' might fail to show backtrace in few vmcores
> > collected
> > >    from older kernels. This is a known issue due to register mismatch,
> > and
> > >    its fix has been merged upstream:
> > >
> > >    This can also cause some 'invalid kernel virtual address' errors
> > during gdb
> > >    unwinding the stack registers
> > >
> > > Commit:
> > https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef785819e72db79
> > >
> > > Fixing GDB passthroughs on other architectures
> > > ==============================================
> > >
> > > Much of the work for making gdb passthroughs like 'gdb bt', 'gdb
> > > thread', 'gdb info locals' etc. has been done by the patches introducing
> > > 'machdep->get_cpu_reg' and this series fixing some issues in that.
> > >
> > > Other architectures should be able to fix these gdb functionalities by
> > > simply implementing 'machdep->get_cpu_reg (cpu, regno, ...)'.
> > >
> > > The reasoning behind that has been explained with a diagram in commit
> > > description of patch #1
> > >
> > > I will assist with my findings/observations fixing it on ppc64 whenever
> > needed.
> > >
> > > Changelog:
> > > ==========
> > >
> > > V9:
> > > + minor change in patch #5: sync gdb context on a 'set' and 'set -p'
> > > + add taoliu's patch for using current context, and fixes in
> > ppc64_get_cpu_reg
> > >
> > > V8:
> > > + use get_active_task instead of depending on CURRENT_CONTEXT in
> > ppc64_get_cpu_reg
> > > + rebase to upstream/master (5977936c0a91)
> > >
> > > V7:
> > > + move changes in gdb-10.2.patch to the end (minor change in patch
> > #3,4,5)
> > > + fix a memory leak in ppc64_get_cpu_reg (minor change in patch #1)
> > > + use ascii diagram in patch #1 description
> > >
> > > V6:
> > > + changes in patch #5: fix bug introduced in v5 that caused initial gdb
> > thread
> > >   to be thread 1
> > >
> > > V5:
> > > + changes in patch #1: made ppc64_get_cpu_reg static, and remove
> > unreachable
> > >   code
> > > + changes in patch #3: fixed typo 'ppc64_renum' instead of
> > 'ppc64_regnum',
> > >   remove unneeded if condition
> > > + changes in patch #5: implement refresh regcache on per thread, instead
> > of all
> > >   threads at once
> > >
> > > V4:
> > > + fix segmentation fault in live debugging (change in patch #1)
> > > + mention live debugging not supported in cover letter and patch #1
> > > + fixed some checkpatch warnings (change in patch #5)
> > >
> > > V3:
> > > + default gdb thread will be the crashing thread, instead of being
> > >   thread '0'
> > > + synchronise crash cpu and gdb thread context
> > > + fix bug in gdb_interface, that replaced gdb's output stream, losing
> > >   output in some cases, such as info threads and extra output in info
> > >   variables
> > > + fix 'info threads'
> > >
> > > RFC V2:
> > >   - removed patch implementing 'frame', 'up', 'down' in crash
> > >   - updated the cover letter by removing the mention of those commands
> > other
> > >         than the respective gdb passthrough
> > >
> > > Aditya Gupta (5):
> > >   ppc64: correct gdb passthroughs by implementing machdep->get_cpu_reg
> > >   remove 'frame' from prohibited commands list
> > >   synchronise cpu context changes between crash/gdb
> > >   fix gdb_interface: restore gdb's output streams at end of
> > >     gdb_interface
> > >   fix 'info threads' command
> > >
> > >  crash_target.c  |  44 ++++++++++++++++
> > >  defs.h          | 130 +++++++++++++++++++++++++++++++++++++++++++++++-
> > >  gdb-10.2.patch  | 110 +++++++++++++++++++++++++++++++++++++++-
> > >  gdb_interface.c |   2 +-
> > >  kernel.c        |  47 +++++++++++++++--
> > >  ppc64.c         |  95 +++++++++++++++++++++++++++++++++--
> > >  task.c          |  14 ++++++
> > >  tools.c         |   2 +-
> > >  8 files changed, 434 insertions(+), 10 deletions(-)
> > >
> > > --
> > > 2.41.0
> > >
> >
> >
--
Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx
https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
Contribution Guidelines: https://github.com/crash-utility/crash/wiki




[Index of Archives]     [Fedora Development]     [Fedora Desktop]     [Fedora SELinux]     [Yosemite News]     [KDE Users]     [Fedora Tools]

 

Powered by Linux