On 12/13/23 22:45, Aditya Gupta wrote:
Hi Lianbo,

On Wed, Dec 13, 2023 at 03:20:40PM +0800, Lianbo Jiang wrote:

Hi, Aditya

Thank you for the v3. I got a core dump after applying the patch[5], but I did not see the backtrace from the core dump file. Could you please check it again?

# ./crash

crash 8.0.4++
Copyright (C) 2002-2022 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011, 2020-2022 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
Copyright (C) 2015, 2021 VMware, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "powerpc64le-unknown-linux-gnu".
Type "show configuration" for configuration details.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Segmentation fault (core dumped)

Thanks for trying it. Was it 'crash-utility' which itself crashed due to the patches?
Seems 'yes'. I did not see the same issue before applying the patchset.
That is weird; the patches should not cause any segmentation fault. Can you please share the steps to reproduce this? I will fix it.
Sorry, I should have described the failure in more detail. It is easy to reproduce on my side.
Step 1: apply these five patches

Step 2: make lzo

Step 3-1: for live debugging, the crash tool fails to load and gets the segfault.

FYI: If the current feature doesn't support live debugging, that needs to be mentioned somewhere; at the very least, crash should not fail to load.
Step 3-2: for kdump case(vmcore)

crash> gdb bt
#0 0xc000000000281298 in crash_setup_regs (
gdb: invalid kernel virtual address: fffffffffffffffb type: "gdb_readmem callback"
gdb: invalid kernel virtual address: fffffffffffffff7 type: "gdb_readmem callback"
gdb: invalid kernel virtual address: fffffffffffffff3 type: "gdb_readmem callback"
gdb: invalid kernel virtual address: fffffffffffffffb type: "gdb_readmem callback"
gdb: invalid kernel virtual address: fffffffffffffff7 type: "gdb_readmem callback"
gdb: invalid kernel virtual address: fffffffffffffff3 type: "gdb_readmem callback"
oldregs=<optimized out>, newregs=0xc00000000c0f7908) at ./arch/powerpc/include/asm/kexec.h:69
#1 __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:975
#2 0xfffffffffffffffb in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
crash> set gdb on
gdb: on
gdb> bt
#0 0xc000000000281298 in crash_setup_regs (oldregs=<optimized out>, newregs=0xc00000000c0f7908) at ./arch/powerpc/include/asm/kexec.h:69
#1 __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:975
#2 0xfffffffffffffffb in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
gdb> info threads
  Id  Target Id  Frame
  1   CPU 0  plpar_hcall_norets_notrace () at arch/powerpc/platforms/pseries/hvCall.S:112
  2   CPU 1  plpar_hcall_norets_notrace () at arch/powerpc/platforms/pseries/hvCall.S:112
  3   CPU 2  plpar_hcall_norets_notrace () at arch/powerpc/platforms/pseries/hvCall.S:112
  4   CPU 3  plpar_hcall_norets_notrace () at arch/powerpc/platforms/pseries/hvCall.S:112
  5   CPU 4  plpar_hcall_norets_notrace () at arch/powerpc/platforms/pseries/hvCall.S:112
  6   CPU 5  plpar_hcall_norets_notrace () at arch/powerpc/platforms/pseries/hvCall.S:112
  7   CPU 6  plpar_hcall_norets_notrace () at arch/powerpc/platforms/pseries/hvCall.S:112
* 8   CPU 7  0xc000000000281298 in crash_setup_regs (
gdb: invalid kernel virtual address: fffffffffffffffb type: "gdb_readmem callback"
gdb: invalid kernel virtual address: fffffffffffffff7 type: "gdb_readmem callback"
gdb: invalid kernel virtual address: fffffffffffffff3 type: "gdb_readmem callback"
gdb: invalid kernel virtual address: fffffffffffffffb type: "gdb_readmem callback"
gdb: invalid kernel virtual address: fffffffffffffff7 type: "gdb_readmem callback"
gdb: invalid kernel virtual address: fffffffffffffff3 type: "gdb_readmem callback"
oldregs=<optimized out>, newregs=0xc00000000c0f7908) at ./arch/powerpc/include/asm/kexec.h:69
gdb> info locals
No locals.
gdb> info args
oldregs = <optimized out>
newregs = 0xc00000000c0f7908
gdb>

Thanks.
Lianbo
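A side note on the addresses in the vmcore output: fffffffffffffffb, fffffffffffffff7 and fffffffffffffff3 sit just below the top of the 64-bit address space and step down by 4 bytes, and the first one matches the bogus pc reported for frame #2. Read as signed 64-bit values they are small negative numbers, which may hint that gdb is reading relative to a bad register value rather than a real kernel address. That is only a guess; the trace does not prove it. A minimal Python sketch of the arithmetic (the helper name `as_signed64` is made up for illustration and is not part of crash or gdb):

```python
def as_signed64(addr: int) -> int:
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    addr &= (1 << 64) - 1
    return addr - (1 << 64) if addr & (1 << 63) else addr


# Addresses copied from the "invalid kernel virtual address" messages.
reported = [0xfffffffffffffffb, 0xfffffffffffffff7, 0xfffffffffffffff3]

# Signed view of each address, and the gap between consecutive reads.
signed = [as_signed64(a) for a in reported]
steps = [a - b for a, b in zip(reported, reported[1:])]

for addr, value in zip(reported, signed):
    print(f"{addr:#018x} -> {value}")
```

Here `signed` comes out as -5, -9 and -13, and every entry in `steps` is 4, so the failing reads are three consecutive 4-byte accesses ending just below address zero.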
# gdb /tmp/core.126506
GNU gdb (GDB) Red Hat Enterprise Linux 10.2-12
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "ppc64le-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
"0x7fffe9bd9b38s": not in executable format: file format not recognized
(gdb) file crash
Reading symbols from crash...
(gdb) bt
No stack.

The patches are not intended to apply to gdb as such, but to provide the feature to have backtrace in gdb mode inside crash-utility.

But the message by gdb seems to say it couldn't read the dump file:

"0x7fffe9bd9b38s": not in executable format: file format not recognized

I will try to cause a crash with upstream kernel and see if anything breaks. Will let you know.

Thanks,
Aditya Gupta

Thanks.
Lianbo

On 12/4/23 22:59, Aditya Gupta wrote:

The Problem:
============

Currently crash is unable to show function arguments and local variables, as gdb can do. And functionality for moving between frames ('up'/'down') is not working in crash.

Crash has 'gdb passthroughs' for things gdb can do, but the gdb passthroughs 'bt', 'frame', 'info locals', 'up', 'down' are not working either, due to gdb not getting the register values from `crash_target::fetch_registers`, which then uses `machdep->get_cpu_reg`, which is not implemented for PPC64

Proposed Solution:
==================

Fix the gdb passthroughs by implementing "machdep->get_cpu_reg" for PPC64.
This way, "gdb mode in crash" will support this feature for both ELF and kdump-compressed vmcore formats, while "gdb" would only have supported ELF format

This way other features of 'gdb', such as seeing backtraces/registers/variables/arguments/local variables, moving up and down stack frames, can be used with any ppc64 vmcore, irrespective of being ELF format or kdump-compressed format.

Implications on Architectures:
====================================

No architecture other than PPC64 has been affected, other than in case of 'frame' command

As mentioned in patch #2, since frame will not be prohibited, so it will print:

    crash> frame
    #0 <unavailable> in ?? ()

Instead of before prohibited message:

    crash> frame
    crash: prohibited gdb command: frame

Major change will be in 'gdb mode' on PPC64, that it will print the frames, and local variables, instead of failing with errors showing no frame, or showing that couldn't get PC, it will be able to give all this information.

Testing:
========

Git tree with this patch series applied:
https://github.com/adi-g15-ibm/crash/tree/stack-unwind-3

To test various gdb passthroughs:

    gdb> set
    gdb> set gdb on
    gdb> thread
    gdb> bt
    gdb> info threads
    gdb> info threads
    gdb> info locals
    gdb> info variables irq_rover_lock
    gdb> info args
    gdb> thread 2
    gdb> set gdb off
    gdb> set
    gdb> set -c 6
    gdb> gdb thread
    gdb> bt
    gdb> gdb bt
    gdb> frame
    gdb> up
    gdb> down
    gdb> info locals

Known Issues:
=============

1. In gdb mode, 'bt' might fail to show backtrace in few vmcores collected from older kernels. This is a known issue due to register mismatch, and its fix has been merged upstream:

   Commit: https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef785819e72db79

Fixing GDB passthroughs on other architectures
==============================================

Much of the work for making gdb passthroughs like 'gdb bt', 'gdb thread', 'gdb info locals' etc. has been done by the patches introducing 'machdep->get_cpu_reg' and this series fixing some issues in that.

Other architectures should be able to fix these gdb functionalities by simply implementing 'machdep->get_cpu_reg (cpu, regno, ...)'. The reasoning behind that has been explained with a diagram in commit description of patch #1

I will assist with my findings/observations fixing it on ppc64 whenever needed.

Additional Notes:
=================

Sorry, it took a long time to send this version. Tried fixing 'info threads' but wasn't able to. Gave it time again, and was able to fix it this time after multiple days of debugging.

Some other things from last version review:

* 'info rv' not working: It's not supported in gdb, instead we need to use 'info locals rv' or 'info variables rv'

* 'info variables' command hangs... and prints nothing after hanging for long
  It likely hangs due to a lot of symbols being there, and it's trying to get all gdb's output and page it, so Control+C messes it up, but if we pass a regex filter to limit the output, eg. info variables rq, then it doesn't hang, and prints the variables/symbols.
  Even with gdb, ie. simply running 'gdb vmlinux vmcore' also hangs due to the lot of symbols

* making crashing thread as default in gdb: This is implemented now, along with synchronising crash & gdb contexts, in patch #3

* 'info threads' not working: This turned to be due to a bug in gdb_interface. I fixed 'info threads' in 2 patches, to simplify it, first for the gdb_interface, and another patch for setting the context correctly in crash

* other info commands: I tested all the info commands, in crash along with this patch. Most of those that fail in crash are due to gdb itself not supporting them with vmcores, and other than that is the 'info pretty' command, which might not be needed in crash anyways

* live debugging showing only one thread: I tried it with crash, crash shows only the current thread, ie. itself, so it does not have information of registers for the other CPUs. Similarly gdb does not support live kernel debugging (without connecting to a gdbstub/QEMU etc.). If you need I can make it show the current thread id correctly for the one thread, but I don't think it might help much with live debugging

Hope, I set the context, thanks for the reviews, I replied and worked on your suggestions, but got stuck there due to 'info threads'

Changelog:
==========

V3:
+ default gdb thread will be the crashing thread, instead of being thread '0'
+ synchronise crash cpu and gdb thread context
+ fix bug in gdb_interface, that replaced gdb's output stream, losing output in some cases, such as info threads and extra output in info variables
+ fix 'info threads'

RFC V2:
- removed patch implementing 'frame', 'up', 'down' in crash
- updated the cover letter by removing the mention of those commands other than the respective gdb passthrough

Aditya Gupta (5):
  ppc64: correct gdb passthroughs by implementing machdep->get_cpu_reg
  remove 'frame' from prohibited commands list
  synchronise cpu context changes between crash/gdb
  fix gdb_interface: restore gdb's output streams at end of gdb_interface
  fix 'info threads' command

 crash_target.c  |  44 ++++++++++++++++
 defs.h          | 130 +++++++++++++++++++++++++++++++++++++++++++++++-
 gdb-10.2.patch  | 110 +++++++++++++++++++++++++++++++++++++++-
 gdb_interface.c |   2 +-
 kernel.c        |  47 +++++++++++++++--
 ppc64.c         |  95 +++++++++++++++++++++++++++++++++--
 task.c          |  14 ++++++
 tools.c         |   2 +-
 8 files changed, 434 insertions(+), 10 deletions(-)