On Mon, Aug 26, 2024 at 11:55 AM <devel-request@xxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
Date: Mon, 26 Aug 2024 15:52:26 +1200
From: Tao Liu <ltao@xxxxxxxxxx>
Subject: [PATCH v6 00/14] gdb stack unwinding support
for crash utility
To: devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx
Cc: Tao Liu <ltao@xxxxxxxxxx>
Message-ID: <20240826035240.14781-1-ltao@xxxxxxxxxx>
Content-Type: text/plain; charset=UTF-8
This patchset is a rebase/merged version of the following 3 patchsets:
1): [PATCH v10 0/5] Improve stack unwind on ppc64 [1]
2): [PATCH 0/5] x86_64 gdb stack unwinding support [2]
3): Clean up on top of one-thread-v2 [3]
A complete description of gdb stack unwinding support for crash can be
found in [1].
This patchset can be divided into the following 2 parts:
1) part1: arch independent, mainly modify on the
crash_target.c/gdb_interface.c files, in preparation of the
gdb side.
2) part2: arch specific part, for implementing ppc64/x86_64/arm64/vmware
gdb stack unwinding support.
=== part 2
- arm64:
arm64: Add gdb stack unwinding support
- vmware:
vmware_guestdump: Various format versions support
set_context(): check if context is already current
- x86_64:
x86_64: Fix invalid input "=>" for bt command
Fix cpumask_t recursive dependence issue
x86_64: Add gdb stack unwinding support
- ppc64:
ppc64: correct gdb passthroughs by implementing machdep->get_cpu_reg
=== part 1
Stop stack unwinding at non-kernel address
Fix gdb_interface: restore gdb's output streams at end of gdb_interface
Print task pid/command instead of CPU index
Rename get_cpu_reg to get_current_task_reg
Let crash change gdb context
Leave only one gdb thread for crash
Remove 'frame' from prohibited commands list
===
v6 -> v5:
1) Refactor patch 4 & 9, which changed the function signature of struct
get_cpu_reg/get_current_task_reg, and let each patch compile with no
error when added on.
2) Rebased the patchset on top of latest upstream:
("79b93ecb2e72ec Fix a "Bus error" issue caused by 'crash --osrelease' or
crash loading")
Thank you for the update, Tao.
I have a few more comments here:
[1] [PATCH v6 04/14]:
+ * Task is active, grab CPU's registers
+ */
+ if (is_task_active(tc->task) && VMSS_DUMPFILE())
+ return vmware_vmss_get_cpu_reg(tc->processor, regno, name, size, value);
+ */
+ if (is_task_active(tc->task) && VMSS_DUMPFILE())
+ return vmware_vmss_get_cpu_reg(tc->processor, regno, name, size, value);
Can you help confirm that it only works for the active task? Is this expected behavior?
[2] [PATCH v6 07/14]
#ifdef CRASH_MERGE
CORE_ADDR pc = 0;
get_frame_pc_if_available (fi, &pc);
if (!is_kvaddr(pc)) {
printf_filtered (_("Backtrace stopped due to non-kernel addr: %lx\n"),pc);
fi = NULL;
break;
}
#endif
I would suggest removing the above printf_filtered(...), otherwise it will be always displayed.
[3] [PATCH v6 08/14]
a. warning
cc -c -g -DPPC64 -m64 -DLZO -DGDB_10_2 ppc64.c -Wall -O2 -Wstrict-prototypes -Wmissing-prototypes -fstack-protector -Wformat-security
ppc64.c: In function ‘ppc64_get_current_task_reg’:
ppc64.c:2519:15: warning: unused variable ‘task’ [-Wunused-variable]
2519 | ulong task;
| ^~~~
ppc64.c: In function ‘ppc64_get_current_task_reg’:
ppc64.c:2519:15: warning: unused variable ‘task’ [-Wunused-variable]
2519 | ulong task;
| ^~~~
b. There are different results between 'bt' and 'gdb bt' commands. Can you help double check?
crash> bt
PID: 6298 TASK: c000000050bbcb00 CPU: 2 COMMAND: "bash"
R0: c0000000002be63c R1: c00000005102b710 R2: c000000001bf8f00
R3: c00000005102b728 R4: 0000000000000000 R5: 0000000000000000
R6: c00000005102b8d8 R7: c00000000f5d0000 R8: 0000000000000000
R9: c0000000557d3c00 R10: 0000000000000001 R11: 0000000000002000
R12: c0000000027634a8 R13: c00000000f6cdf00 R14: 0000000000000000
R15: 0000000000000000 R16: 0000000000000000 R17: 0000000000000000
R18: 0000000000000000 R19: 0000000000000000 R20: 0000000000000000
R21: 0000000000000000 R22: 0000000000000000 R23: 0000000000000000
R24: 0000000000000007 R25: 0000000000000000 R26: 0000000000000000
R27: c00000000274d898 R28: c000000002d0a8d0 R29: c000000002e23df0
R30: 0000000000000000 R31: c000000002e23ddc
NIP: c0000000002bdf64 MSR: 8000000000009033 OR3: 0000000000000000
CTR: 0000000000000000 LR: c0000000002be63c XER: 0000000020040004
CCR: 0000000024222280 MQ: 0000000000000001 DAR: 0000000000000000
DSISR: 0000000000000000 Syscall Result: 0000000000000000
[NIP : crash_setup_regs+68]
[LR : __crash_kexec+156]
#0 [c00000005102b710] crash_setup_regs at c0000000002bdf64
crash> gdb bt
#0 0xc0000000002bdf64 in crash_setup_regs (newregs=0xc00000005102b728, oldregs=0x0) at ./arch/powerpc/include/asm/kexec.h:133
#1 0xc0000000002be658 in __crash_kexec (regs=0x0) at kernel/crash_core.c:122
#2 0xc00000000016c284 in panic (fmt=0xc0000000015dd018 "sysrq triggered crash\n") at kernel/panic.c:367
#3 0xc000000000a66e78 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:154
#4 0xc000000000a67994 in __handle_sysrq (key=key@entry=99 'c', check_mask=check_mask@entry=false) at drivers/tty/sysrq.c:612
#5 0xc000000000a68454 in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>, count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1181
#6 0xc00000000072868c in pde_write (pde=0xc000000003671f80, file=<optimized out>, buf=<optimized out>, count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:334
#7 proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:346
#8 0xc00000000063bf20 in vfs_write (file=0xc000000009cb5b00, buf=0x100123a71a0 <error: Cannot access memory at address 0x100123a71a0>, count=2, pos=0xc00000005102bc00) at fs/read_write.c:588
#9 vfs_write (file=0xc000000009cb5b00, buf=0x100123a71a0 <error: Cannot access memory at address 0x100123a71a0>, count=<optimized out>, pos=0xc00000005102bc00) at fs/read_write.c:570
#10 0xc00000000063c4d0 in ksys_write (fd=<optimized out>, buf=0x100123a71a0 <error: Cannot access memory at address 0x100123a71a0>, count=2) at fs/read_write.c:643
#11 0xc000000000031a28 in system_call_exception (regs=0xc00000005102be80, r0=<optimized out>) at arch/powerpc/kernel/syscall.c:153
#12 0xc00000000000d05c in system_call_vectored_common () at arch/powerpc/kernel/interrupt_64.S:198
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
crash>
PID: 6298 TASK: c000000050bbcb00 CPU: 2 COMMAND: "bash"
R0: c0000000002be63c R1: c00000005102b710 R2: c000000001bf8f00
R3: c00000005102b728 R4: 0000000000000000 R5: 0000000000000000
R6: c00000005102b8d8 R7: c00000000f5d0000 R8: 0000000000000000
R9: c0000000557d3c00 R10: 0000000000000001 R11: 0000000000002000
R12: c0000000027634a8 R13: c00000000f6cdf00 R14: 0000000000000000
R15: 0000000000000000 R16: 0000000000000000 R17: 0000000000000000
R18: 0000000000000000 R19: 0000000000000000 R20: 0000000000000000
R21: 0000000000000000 R22: 0000000000000000 R23: 0000000000000000
R24: 0000000000000007 R25: 0000000000000000 R26: 0000000000000000
R27: c00000000274d898 R28: c000000002d0a8d0 R29: c000000002e23df0
R30: 0000000000000000 R31: c000000002e23ddc
NIP: c0000000002bdf64 MSR: 8000000000009033 OR3: 0000000000000000
CTR: 0000000000000000 LR: c0000000002be63c XER: 0000000020040004
CCR: 0000000024222280 MQ: 0000000000000001 DAR: 0000000000000000
DSISR: 0000000000000000 Syscall Result: 0000000000000000
[NIP : crash_setup_regs+68]
[LR : __crash_kexec+156]
#0 [c00000005102b710] crash_setup_regs at c0000000002bdf64
crash> gdb bt
#0 0xc0000000002bdf64 in crash_setup_regs (newregs=0xc00000005102b728, oldregs=0x0) at ./arch/powerpc/include/asm/kexec.h:133
#1 0xc0000000002be658 in __crash_kexec (regs=0x0) at kernel/crash_core.c:122
#2 0xc00000000016c284 in panic (fmt=0xc0000000015dd018 "sysrq triggered crash\n") at kernel/panic.c:367
#3 0xc000000000a66e78 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:154
#4 0xc000000000a67994 in __handle_sysrq (key=key@entry=99 'c', check_mask=check_mask@entry=false) at drivers/tty/sysrq.c:612
#5 0xc000000000a68454 in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>, count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1181
#6 0xc00000000072868c in pde_write (pde=0xc000000003671f80, file=<optimized out>, buf=<optimized out>, count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:334
#7 proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:346
#8 0xc00000000063bf20 in vfs_write (file=0xc000000009cb5b00, buf=0x100123a71a0 <error: Cannot access memory at address 0x100123a71a0>, count=2, pos=0xc00000005102bc00) at fs/read_write.c:588
#9 vfs_write (file=0xc000000009cb5b00, buf=0x100123a71a0 <error: Cannot access memory at address 0x100123a71a0>, count=<optimized out>, pos=0xc00000005102bc00) at fs/read_write.c:570
#10 0xc00000000063c4d0 in ksys_write (fd=<optimized out>, buf=0x100123a71a0 <error: Cannot access memory at address 0x100123a71a0>, count=2) at fs/read_write.c:643
#11 0xc000000000031a28 in system_call_exception (regs=0xc00000005102be80, r0=<optimized out>) at arch/powerpc/kernel/syscall.c:153
#12 0xc00000000000d05c in system_call_vectored_common () at arch/powerpc/kernel/interrupt_64.S:198
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
crash>
[4] [PATCH v6 10/14]
diff --git a/xen_hyper.c b/xen_hyper.c
index 32e56fa..54c7f57 100644
--- a/xen_hyper.c
+++ b/xen_hyper.c
@@ -52,7 +52,7 @@ xen_hyper_init(void)
*/
xht->xen_virt_start &= 0xffffffffc0000000;
#endif
-
+ STRUCT_SIZE_INIT(cpumask_t, "cpumask_t");
if (machine_type("X86_64") &&
symbol_exists("xen_phys_start") && !xen_phys_start())
error(WARNING,
index 32e56fa..54c7f57 100644
--- a/xen_hyper.c
+++ b/xen_hyper.c
@@ -52,7 +52,7 @@ xen_hyper_init(void)
*/
xht->xen_virt_start &= 0xffffffffc0000000;
#endif
-
+ STRUCT_SIZE_INIT(cpumask_t, "cpumask_t");
if (machine_type("X86_64") &&
symbol_exists("xen_phys_start") && !xen_phys_start())
error(WARNING,
Could you please explain why the above changes are needed?
[5] [PATCH v6 14/14]
a. warning
cc -c -g -DARM64 -DLZO -DGDB_10_2 arm64.c -Wall -O2 -Wstrict-prototypes -Wmissing-prototypes -fstack-protector -Wformat-security
arm64.c: In function ‘arm64_get_stack_frame’:
arm64.c:4138:13: warning: variable ‘ret’ set but not used [-Wunused-but-set-variable]
4138 | int ret;
| ^~~
arm64.c: In function ‘arm64_get_stack_frame’:
arm64.c:4138:13: warning: variable ‘ret’ set but not used [-Wunused-but-set-variable]
4138 | int ret;
| ^~~
b. indentation issue
- bt->task);
+ ur_bitmap = (struct user_regs_bitmap_struct *)GETBUF(sizeof(*ur_bitmap));
+ memset(ur_bitmap, 0, sizeof(*ur_bitmap));
+ ur_bitmap->ur.pc = stackframe.pc;
+ ur_bitmap = (struct user_regs_bitmap_struct *)GETBUF(sizeof(*ur_bitmap));
+ memset(ur_bitmap, 0, sizeof(*ur_bitmap));
+ ur_bitmap->ur.pc = stackframe.pc;
[6] The 'info threads' command only displays the current task, not all known tasks. And this looks different behavior from the gdb
gdb> info threads
Id Target Id Frame
* 1 6298 bash 0xc0000000002bdf64 in crash_setup_regs (newregs=0xc00000005102b728, oldregs=0x0) at ./arch/powerpc/include/asm/kexec.h:133
Id Target Id Frame
* 1 6298 bash 0xc0000000002bdf64 in crash_setup_regs (newregs=0xc00000005102b728, oldregs=0x0) at ./arch/powerpc/include/asm/kexec.h:133
[7] the 'thread' command can not switch to another task.
crash> ps
...
6297 6287 0 c000000057517e80 IN 0.0 22784 11776 sshd-session
> 6298 6297 2 c000000050bbcb00 RU 0.0 9664 6272 bash
> 6298 6297 2 c000000050bbcb00 RU 0.0 9664 6272 bash
...
gdb> thread 6297
gdb: gdb request failed: thread 6297
gdb>
gdb: gdb request failed: thread 6297
gdb>
Finally I have to do it with 'set' command as below:
crash> set 6297
PID: 6297
COMMAND: "sshd-session"
TASK: c000000057517e80 [THREAD_INFO: c000000057517e80]
CPU: 0
STATE: TASK_INTERRUPTIBLE
crash> set gdb on
gdb: on
gdb> bt
#0 0xc00000009cd57650 in ?? ()
gdb: gdb request failed: bt
gdb>
PID: 6297
COMMAND: "sshd-session"
TASK: c000000057517e80 [THREAD_INFO: c000000057517e80]
CPU: 0
STATE: TASK_INTERRUPTIBLE
crash> set gdb on
gdb: on
gdb> bt
#0 0xc00000009cd57650 in ?? ()
gdb: gdb request failed: bt
gdb>
In addition, the code is changed, but its patch log is not updated accordingly. Could you please double check? I won't list them here.
Thanks.
Lianbo
v5 -> v4:
1) Plenty of code refactoring based on Lianbo's comments on v4.
2) Removed the magic number when dealing with regs bitmap, see [6].
3) Rebased the patchset on top of latest upstream:
("1c6da3eaff8207 arm64: Fix bt command show wrong stacktrace on ramdump source")
v4 -> v3:
Fixed the author issue in [PATCH v3 06/16] Fix gdb_interface: restore gdb's
output streams at end of gdb_interface.
v3 -> v2:
1) Updated CC list as pointed out in [4]
2) Compiling issues as in [5]
v2 -> v1:
1) Added the patch: x86_64: Fix invalid input "=>" for bt command,
thanks for Kazu's testing.
2) Modify the patch: x86_64: Add gdb stack unwinding support, added the
pcp_save, spp_save and sp, for restoring the value in match of the original
code logic.
[1]: https://www.mail-archive.com/devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx/msg00469.html
[2]: https://www.mail-archive.com/devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx/msg00488.html
[3]: https://www.mail-archive.com/devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx/msg00554.html
[4]: https://www.mail-archive.com/devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx/msg00681.html
[5]: https://www.mail-archive.com/devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx/msg00715.html
[6]: https://www.mail-archive.com/devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx/msg00819.html
Aditya Gupta (3):
Remove 'frame' from prohibited commands list
Fix gdb_interface: restore gdb's output streams at end of
gdb_interface
ppc64: correct gdb passthroughs by implementing machdep->get_cpu_reg
Alexey Makhalov (2):
set_context(): check if context is already current
vmware_guestdump: Various format versions support
Tao Liu (9):
Leave only one gdb thread for crash
Let crash change gdb context
Rename get_cpu_reg to get_current_task_reg
Print task pid/command instead of CPU index
Stop stack unwinding at non-kernel address
x86_64: Add gdb stack unwinding support
Fix cpumask_t recursive dependence issue
x86_64: Fix invalid input "=>" for bt command
arm64: Add gdb stack unwinding support
arm64.c | 115 +++++++++++++++-
crash_target.c | 71 ++++++----
defs.h | 194 ++++++++++++++++++++++++++-
gdb-10.2.patch | 82 ++++++++++++
gdb_interface.c | 35 ++---
kernel.c | 63 +++++++--
ppc64.c | 175 +++++++++++++++++++++++-
symbols.c | 15 +++
task.c | 34 +++--
tools.c | 10 +-
unwind_x86_64.h | 4 -
vmware_guestdump.c | 321 +++++++++++++++++++++++++++++++-------------
x86_64.c | 323 ++++++++++++++++++++++++++++++++++++++++-----
xen_hyper.c | 2 +-
14 files changed, 1225 insertions(+), 219 deletions(-)
--
2.40.1
-- Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/ Contribution Guidelines: https://github.com/crash-utility/crash/wiki