Hi,

Please check out the patch which addresses the bug:
https://www.mail-archive.com/devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx/msg01128.html

Thanks,
Tao Liu

On Fri, Sep 13, 2024 at 9:07 PM <mycomplexlove@xxxxxxxxx> wrote:
>
> Hello, crash main programmers.
> I found a problem. On crash with gdb10.2, I have a vmcore that prints parts
> that shouldn't appear when parsing the process stack.
> I have had some discussions with liutgnu. I recompiled and tried based on https://github.com/liutgnu/crash-preview.
> Unfortunately, it seems that the crash version based on gdb13.2 still has this problem.
> Here is the output of my test:
> -------------------
> crash 8.0.4++
> Copyright (C) 2002-2022 Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
> Copyright (C) 1999-2006 Hewlett-Packard Co
> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
> Copyright (C) 2005, 2011, 2020-2022 NEC Corporation
> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> Copyright (C) 2015, 2021 VMware, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions. Enter "help copying" to see the conditions.
> This program has absolutely no warranty. Enter "help warranty" for details.
>
> GNU gdb (GDB) 13.2
> Copyright (C) 2023 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> Type "show copying" and "show warranty" for details.
> This GDB was configured as "x86_64-pc-linux-gnu".
> Type "show configuration" for configuration details.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
>
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
>
>       KERNEL: /root/hungtask/vmlinux [TAINTED]
>     DUMPFILE: /root/hungtask/2024_09_06_05_02_15.kernel_core [PARTIAL DUMP]
>         CPUS: 64
>         DATE: Fri Sep 6 05:01:47 CST 2024
>       UPTIME: 12:27:05
> LOAD AVERAGE: 56.87, 25.40, 18.24
>        TASKS: 4319
>     NODENAME: host-047bcb37834d
>      RELEASE: 4.19.90-89.11.v2401.osc.sfc.6.11.0.0070.ky10.x86_64+debug
>      VERSION: #1 SMP Fri Aug 30 08:21:33 UTC 2024
>      MACHINE: x86_64 (2499 Mhz)
>       MEMORY: 255.9 GB
>        PANIC: "Kernel panic - not syncing: softlockup: hung tasks"
>          PID: 112450
>      COMMAND: "vtpstatd"
>         TASK: ffff88816ae80000 [THREAD_INFO: ffff88816ae80000]
>          CPU: 41
>        STATE: TASK_RUNNING (PANIC)
>
> crash> bt
> PID: 112450 TASK: ffff88816ae80000 CPU: 41 COMMAND: "vtpstatd"
>  #0 [ffff889e3fa87af8] machine_kexec at ffffffff92d059ab
>  #1 [ffff889e3fa87c18] __crash_kexec at ffffffff92fb9a99
>  #2 [ffff889e3fa87d30] panic at ffffffff9483ed43
>  #3 [ffff889e3fa87df8] watchdog_timer_fn at ffffffff93052cf6
>  #4 [ffff889e3fa87e30] __hrtimer_run_queues at ffffffff92f5e96e
>  #5 [ffff889e3fa87f28] hrtimer_interrupt at ffffffff92f5ffe7
>  #6 [ffff889e3fa87fc8] smp_apic_timer_interrupt at ffffffff94a03176
>  #7 [ffff889e3fa87ff0] apic_timer_interrupt at ffffffff94a0192f
> --- <IRQ stack> ---
>  #8 [ffff888263157938] apic_timer_interrupt at ffffffff94a0192f
>     [exception RIP: copy_page_range+3681]
>     RIP: ffffffff9331d461 RSP: ffff8882631579e8 RFLAGS: 00000246
>     RAX: 1ffffd4018a95ad1 RBX: 8000003152b5a805 RCX: ffffea00c54ad688
>     RDX: ffffea00c54aee88 RSI: 00007f80d117f000 RDI: ffffffff956468e0
>     RBP: ffff8881c5bc2bf8 R8: fffff94018a2e22f R9: fffff94018a2e22f
>     R10: 0000000000000001 R11: fffff94018a2e22e R12: 0000000000000018
>     R13: dffffc0000000000 R14: ffffea00c54ad680 R15: 00007f80d117f000
>     ORIG_RAX: ffffffffffffff13 CS: 0010 SS: 0018
>  #9 [ffff888263157bc0] copy_process at ffffffff92dadcbd
> #10 [ffff888263157d20] __mutex_init at ffffffff92ed8dd5
> #11 [ffff888263157d38] __alloc_file at ffffffff93458397
> #12 [ffff888263157d60] alloc_empty_file at ffffffff934585d2
> #13 [ffff888263157da8] __alloc_fd at ffffffff934b5ead
> #14 [ffff888263157e38] _do_fork at ffffffff92dae7a1
> #15 [ffff888263157f28] do_syscall_64 at ffffffff92c085f4
> #16 [ffff888263157f50] entry_SYSCALL_64_after_hwframe at ffffffff94a000a4
>     RIP: 00007f80ec93641a RSP: 00007ffcb38bbd50 RFLAGS: 00000246
>     RAX: ffffffffffffffda RBX: 00007ffcb38bbd50 RCX: 00007f80ec93641a
>     RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
>     RBP: 00007ffcb38bbde0 R8: 000000000001b742 R9: 00007f80ee1a0f80
>     R10: 00007f80ee1a1250 R11: 0000000000000246 R12: 000000000001b742
>     R13: 00007ffcb38bbd70 R14: 0000000000000000 R15: 00007ffcb38bbf00
>     ORIG_RAX: 0000000000000038 CS: 0033 SS: 002b
> -------------------
> Frames #10 to #13 seem redundant.
>
> The following is the analysis output based on gdb7.6 and the latest crash code:
> -------------------
> crash_805_gdb76 8.0.5++
> Copyright (C) 2002-2024 Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
> Copyright (C) 1999-2006 Hewlett-Packard Co
> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
> Copyright (C) 2005, 2011, 2020-2024 NEC Corporation
> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions. Enter "help copying" to see the conditions.
> This program has absolutely no warranty. Enter "help warranty" for details.
>
> GNU gdb (GDB) 7.6
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> Type "show copying" and "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu"...
>
> WARNING: kernel relocated [284MB]: patching 99408 gdb minimal_symbol values
>
> crash_805_gdb76: gdb cannot find text block for address: dd_init_queue
>       KERNEL: vmlinux [TAINTED]
>     DUMPFILE: 2024_09_06_05_02_15.kernel_core [PARTIAL DUMP]
>         CPUS: 64
>         DATE: Fri Sep 6 05:01:47 CST 2024
>       UPTIME: 12:27:05
> LOAD AVERAGE: 56.87, 25.40, 18.24
>        TASKS: 4319
>     NODENAME: host-047bcb37834d
>      RELEASE: 4.19.90-89.11.v2401.osc.sfc.6.11.0.0070.ky10.x86_64+debug
>      VERSION: #1 SMP Fri Aug 30 08:21:33 UTC 2024
>      MACHINE: x86_64 (2499 Mhz)
>       MEMORY: 255.9 GB
>        PANIC: "Kernel panic - not syncing: softlockup: hung tasks"
>          PID: 112450
>      COMMAND: "vtpstatd"
>         TASK: ffff88816ae80000 [THREAD_INFO: ffff88816ae80000]
>          CPU: 41
>        STATE: TASK_RUNNING (PANIC)
>
> crash_805_gdb76> bt
> PID: 112450 TASK: ffff88816ae80000 CPU: 41 COMMAND: "vtpstatd"
>  #0 [ffff889e3fa87af8] machine_kexec at ffffffff92d059ab
>  #1 [ffff889e3fa87c18] __crash_kexec at ffffffff92fb9a99
>  #2 [ffff889e3fa87d30] panic at ffffffff9483ed43
>  #3 [ffff889e3fa87df8] watchdog_timer_fn at ffffffff93052cf6
>  #4 [ffff889e3fa87e30] __hrtimer_run_queues at ffffffff92f5e96e
>  #5 [ffff889e3fa87f28] hrtimer_interrupt at ffffffff92f5ffe7
>  #6 [ffff889e3fa87fc8] smp_apic_timer_interrupt at ffffffff94a03176
>  #7 [ffff889e3fa87ff0] apic_timer_interrupt at ffffffff94a0192f
> --- <IRQ stack> ---
>  #8 [ffff888263157938] apic_timer_interrupt at ffffffff94a0192f
>     [exception RIP: copy_page_range+3681]
>     RIP: ffffffff9331d461 RSP: ffff8882631579e8 RFLAGS: 00000246
>     RAX: 1ffffd4018a95ad1 RBX: 8000003152b5a805 RCX: ffffea00c54ad688
>     RDX: ffffea00c54aee88 RSI: 00007f80d117f000 RDI: ffffffff956468e0
>     RBP: ffff8881c5bc2bf8 R8: fffff94018a2e22f R9: fffff94018a2e22f
>     R10: 0000000000000001 R11: fffff94018a2e22e R12: 0000000000000018
>     R13: dffffc0000000000 R14: ffffea00c54ad680 R15: 00007f80d117f000
>     ORIG_RAX: ffffffffffffff13 CS: 0010 SS: 0018
>  #9 [ffff888263157bc0] copy_process at ffffffff92dadcbd
> #10 [ffff888263157e38] _do_fork at ffffffff92dae7a1
> #11 [ffff888263157f28] do_syscall_64 at ffffffff92c085f4
> #12 [ffff888263157f50] entry_SYSCALL_64_after_hwframe at ffffffff94a000a4
>     RIP: 00007f80ec93641a RSP: 00007ffcb38bbd50 RFLAGS: 00000246
>     RAX: ffffffffffffffda RBX: 00007ffcb38bbd50 RCX: 00007f80ec93641a
>     RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
>     RBP: 00007ffcb38bbde0 R8: 000000000001b742 R9: 00007f80ee1a0f80
>     R10: 00007f80ee1a1250 R11: 0000000000000246 R12: 000000000001b742
>     R13: 00007ffcb38bbd70 R14: 0000000000000000 R15: 00007ffcb38bbf00
>     ORIG_RAX: 0000000000000038 CS: 0033 SS: 002b
> -------------------
> It seems that the gdb7.6 parsing is more convincing. This version was built by reverting the gdb update commit
> (github url: https://github.com/crash-utility/crash/commit/9fab193).
> I also tried the release versions of crash 7.3.2 and 8.0.1 (I had problems compiling 8.0.0),
> and the results are consistent with the above. 7.3.2 parsing is normal, and 8.0.1 has the problem.
>
>
> In crash_805_gdb76, x86_64_framesize_cache[3].framesize=624:
> (gdb) p x86_64_framesize_cache[0]
> $136 = {textaddr = 18446744071880546969, framesize = 272, exception = 0}
> (gdb) p x86_64_framesize_cache[1]
> $137 = {textaddr = 18446744071906258243, framesize = 192, exception = 0}
> (gdb) p x86_64_framesize_cache[2]
> $138 = {textaddr = 18446744071908104495, framesize = 8, exception = 0}
> (gdb) p x86_64_framesize_cache[3]
> $139 = {textaddr = 18446744071878401213, framesize = 624, exception = 0}
>
> but in crash_805_gdb102, x86_64_framesize_cache[3].framesize=0:
> (gdb) p x86_64_framesize_cache[0]
> $86 = {textaddr = 18446744071880546969, framesize = 272, exception = 0}
> (gdb) p x86_64_framesize_cache[1]
> $87 = {textaddr = 18446744071906258243, framesize = 192, exception = 0}
> (gdb) p x86_64_framesize_cache[2]
> $88 = {textaddr = 18446744071908104495, framesize = 8, exception = 0}
> (gdb) p x86_64_framesize_cache[3]
> $89 = {textaddr = 18446744071878401213, framesize = 0, exception = 0}
>
> ---------------------------------------------
> After [Walk the process stack.] in x86_64_low_budget_back_trace_cmd, the value of *up is as follows:
> x86_64.c:4059 switch (x86_64_print_stack_entry(bt, ofp, level, i,*up))
>
> The addresses returned by crash_805_gdb76:
> 0xffffffff92d059ab
> 0xffffffff92fb9a99
> 0xffffffff9483ed43
> 0xffffffff93052cf6
> 0xffffffff92f5e96e
> 0xffffffff92f5ffe7
> 0xffffffff94a03176
> 0xffffffff94a0192f
> 0xffffffff92dadcbd <-copy_page_range
> 0xffffffff92dae7a1
> 0xffffffff92c085f4
> 0xffffffff94a000a4
>
> The addresses returned by crash_805_gdb102:
> 0xffffffff92d059ab
> 0xffffffff92fb9a99
> 0xffffffff9483ed43
> 0xffffffff93052cf6
> 0xffffffff92f5e96e
> 0xffffffff92f5ffe7
> 0xffffffff94a03176
> 0xffffffff94a0192f
> 0xffffffff92dadcbd <-copy_page_range
> 0xffffffff92ed8dd5 -------Parts that shouldn't appear
> 0xffffffff93458397
> 0xffffffff934585d2
> 0xffffffff934b5ead --------Parts that shouldn't appear
> 0xffffffff92dae7a1
> 0xffffffff92c085f4
> 0xffffffff94a000a4
>
> The symbols for the unexpected addresses:
> 0xffffffff92ed8dd5 __mutex_init+181
> 0xffffffff93458397 __alloc_file+407
> 0xffffffff934585d2 alloc_empty_file+146
> 0xffffffff934b5ead __alloc_fd+141
>
> Parameters used to generate the vmcore:
> makedumpfile -l -d 31 /proc/vmcore [date].kernel_core
>
> Unfortunately, I am not using a regular distribution; it is a deeply customized one.
>
> vmcore google drive url:
> https://drive.google.com/file/d/1pDICRP6zQafe00c4LWRV-SklkM75971P/view
> --
> Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx
> https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
> Contribution Guidelines: https://github.com/crash-utility/crash/wiki
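
For anyone who wants to cross-check the numbers in the report, here is a small sketch in Python. It only re-uses values quoted in the report: the decimal textaddr values from x86_64_framesize_cache, and the stack slots from the two "bt" outputs. The variable and function names are made up for illustration and are not crash internals. It also assumes, based only on how the printed frames are spaced, that the distance from one printed frame to the next is the cached framesize plus one 8-byte slot for the return address.

```python
# Cross-check of values copied from the report above.
# All names here are illustrative; they are not crash internals.

WORD = 8  # bytes per stack slot on x86_64

# (textaddr as gdb printed it in decimal, framesize) from crash_805_gdb76.
# In crash_805_gdb102 the last entry has framesize 0 instead of 624.
FRAMESIZE_CACHE = [
    (18446744071880546969, 272),
    (18446744071906258243, 192),
    (18446744071908104495, 8),
    (18446744071878401213, 624),
]

# return address -> (function, stack slot of this frame, stack slot of the
# next printed frame), taken from the crash_805_gdb76 "bt" output.
GOOD_FRAMES = {
    0xffffffff92fb9a99: ("__crash_kexec", 0xffff889e3fa87c18, 0xffff889e3fa87d30),
    0xffffffff9483ed43: ("panic", 0xffff889e3fa87d30, 0xffff889e3fa87df8),
    0xffffffff92dadcbd: ("copy_process", 0xffff888263157bc0, 0xffff888263157e38),
}

# Stack slots of frames #10 to #13, printed only by the gdb 10.2/13.2 builds.
EXTRA_SLOTS = [
    0xffff888263157d20,
    0xffff888263157d38,
    0xffff888263157d60,
    0xffff888263157da8,
]


def frame_gaps():
    """Map function name -> (observed gap to next frame, cached framesize)."""
    gaps = {}
    for textaddr, framesize in FRAMESIZE_CACHE:
        if textaddr in GOOD_FRAMES:
            name, slot, next_slot = GOOD_FRAMES[textaddr]
            gaps[name] = (next_slot - slot, framesize)
    return gaps


def extra_slots_inside_copy_process():
    """True if every extra frame lies between copy_process and _do_fork."""
    _, start, end = GOOD_FRAMES[0xffffffff92dadcbd]
    return all(start < slot < end for slot in EXTRA_SLOTS)


if __name__ == "__main__":
    for textaddr, framesize in FRAMESIZE_CACHE:
        print(hex(textaddr), framesize)
    for name, (gap, framesize) in frame_gaps().items():
        print(name, "gap:", gap, "framesize + WORD:", framesize + WORD)
    print("extra frames inside copy_process frame:",
          extra_slots_inside_copy_process())
```

Under that assumption, cache entry [3] is the copy_process return address 0xffffffff92dadcbd, and the extra frames #10 to #13 all sit inside the 632 bytes between copy_process and _do_fork. That would be consistent with a framesize of 0 causing stale text addresses inside the copy_process frame to be reported as frames, but this is only a reading of the quoted numbers, not a statement about what the patch changes.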