On Fri, Nov 20, 2015 at 03:18:55PM -0500, Dave Anderson wrote:
>
>
> ----- Original Message -----
> >
> >
> > ----- Original Message -----
> > > QEMU can generate both non-makedumpfile (just elf) and makedumpfile
> > > formatted kdumps. In neither case will crash_notes have prstatus, as
> > > crash_kexec doesn't run in the kernel, however the elf notes will
> > > contain the prstatus, and we can dig them out of there.
> >
> > I don't have a lot of ARM and ARM64 dumpfiles, but just doing a
> > quick sanity test of your patch, I came across this ARM dumpfile,
> > which I believe may be a QEMU-generated ELF vmcore. I'm not sure,
> > but it only has 1 NT_PRSTATUS note for the 1 online cpu (of 5 cpus).

If it's more than a day old then it won't be from qemu. I just posted
the patches for that yesterday morning :-)

> >
> > But anyway, note that as expected, it cannot find the registers in the
> > kernel's uninitialized crash_notes -- here without your patch:

So it looks like there may be other dump types (besides qemu generated)
that can result in missing crash_notes. I don't know what those are,
other than just corruption? In any case, I think this dump is still a
good test case for the reason you found below.

> >
> > $ crash vmcore.pae vmlinux.pae.gz
> >
> > crash 7.1.4rc15
> > Copyright (C) 2002-2014 Red Hat, Inc.
> > Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
> > Copyright (C) 1999-2006 Hewlett-Packard Co
> > Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
> > Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
> > Copyright (C) 2005, 2011 NEC Corporation
> > Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
> > Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> > This program is free software, covered by the GNU General Public License,
> > and you are welcome to change it and/or distribute copies of it under
> > certain conditions. Enter "help copying" to see the conditions.
> > This program has absolutely no warranty.
> > Enter "help warranty" for details.
> >
> > GNU gdb (GDB) 7.6
> > Copyright (C) 2013 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later
> > <http://gnu.org/licenses/gpl.html>
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> > and "show warranty" for details.
> > This GDB was configured as "--host=x86_64-unknown-linux-gnu
> > --target=arm-elf-linux"...
> >
> > WARNING: invalid note (n_type != NT_PRSTATUS)
> > WARNING: cannot retrieve registers for active tasks
> >
> > KERNEL: vmlinux.pae.gz
> > DUMPFILE: vmcore.pae
> > CPUS: 5 [OFFLINE: 4]
> > DATE: Sun Jun 8 18:27:39 2014
> > UPTIME: 00:03:22
> > LOAD AVERAGE: 0.16, 0.16, 0.07
> > TASKS: 51
> > NODENAME: buildroot
> > RELEASE: 3.13.5
> > VERSION: #3 SMP Mon Jun 9 05:58:39 CST 2014
> > MACHINE: armv7l (unknown Mhz)
> > MEMORY: 256 MB
> > PANIC: "SysRq : Trigger a crash"
> > PID: 732
> > COMMAND: "sh"
> > TASK: 8bcead00 [THREAD_INFO: 8ad32000]
> > CPU: 0
> > STATE: TASK_RUNNING (SYSRQ)
> >
> > crash> bt -a
> > PID: 732 TASK: 8bcead00 CPU: 0 COMMAND: "sh"
> > bt: WARNING: cannot determine starting stack frame for task 8bcead00
> >
> > PID: 0 TASK: 8bc561c0 CPU: 1 COMMAND: "swapper/1"
> > bt: WARNING: cannot determine starting stack frame for task 8bc561c0
> >
> > PID: 0 TASK: 8bc56580 CPU: 2 COMMAND: "swapper/2"
> > bt: WARNING: cannot determine starting stack frame for task 8bc56580
> >
> > PID: 0 TASK: 8bc56940 CPU: 3 COMMAND: "swapper/3"
> > bt: WARNING: cannot determine starting stack frame for task 8bc56940
> >
> > PID: 0 TASK: 8bc56d00 CPU: 4 COMMAND: "swapper/4"
> > bt: WARNING: cannot determine starting stack frame for task 8bc56d00
> > crash>
> >
> >
> > With your patch applied, it generates a SIGSEGV in arm_get_crash_notes():
> >
> > $ ./crash /usr/dumps/ARM/vmcore.pae /usr/dumps/ARM/vmlinux.pae.gz
> >
> > crash 7.1.4rc15
> > Copyright (C) 2002-2014 Red Hat, Inc.
> > Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
> > Copyright (C) 1999-2006 Hewlett-Packard Co
> > Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
> > Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
> > Copyright (C) 2005, 2011 NEC Corporation
> > Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
> > Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> > This program is free software, covered by the GNU General Public License,
> > and you are welcome to change it and/or distribute copies of it under
> > certain conditions. Enter "help copying" to see the conditions.
> > This program has absolutely no warranty. Enter "help warranty" for
> > details.
> >
> > GNU gdb (GDB) 7.6
> > Copyright (C) 2013 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later
> > <http://gnu.org/licenses/gpl.html>
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> > and "show warranty" for details.
> > This GDB was configured as "--host=x86_64-unknown-linux-gnu
> > --target=arm-elf-linux"...
> >
> > Segmentation fault (core dumped)
> > $
> >
> > I haven't debugged it other than determining that the "note" it looks to
> > have found the single note OK, but then upon continuation the next time
> > through the loop, the "note" pointer is valid at line 597, but your
> > function sets it back to NULL, and therefore it craps out at line 622:
> >
> > 597  note = (Elf32_Nhdr *)buf;
> > 598  p = buf + sizeof(Elf32_Nhdr);
> > 599
> > 600  /*
> > 601   * dumpfiles created with qemu won't have crash_notes, but there will
> > 602   * be elf notes.
> > 603   */
> > 604  if (note->n_namesz == 0 && (DISKDUMP_DUMPFILE() || KDUMP_DUMPFILE())) {
> > 605      if (DISKDUMP_DUMPFILE())
> > 606          note = diskdump_get_prstatus_percpu(i);
> > 607      else if (KDUMP_DUMPFILE())
> > 608          note = netdump_get_prstatus_percpu(i);
> > 609      if (note) {
> > 610          /*
> > 611           * SIZE(note_buf) accounts for a "final note", which is a
> > 612           * trailing empty elf note header.
> > 613           */
> > 614          long notesz = SIZE(note_buf) - sizeof(Elf32_Nhdr);
> > 615
> > 616          if (sizeof(Elf32_Nhdr) + roundup(note->n_namesz, 4) +
> > 617              note->n_descsz == notesz)
> > 618              BCOPY((char *)note, buf, notesz);
> > 619      }
> > 620  }
> > 621
> > 622  if (note->n_type != NT_PRSTATUS) {
> > 623      error(WARNING, "invalid note (n_type != NT_PRSTATUS)\n");
> > 624      goto fail;
> > 625  }
> >
> > Not sure how you want to handle that, probably just bail out the same way
> > if note becomes NULL?
>
> If I add this to arm_get_crash_notes(), just after your new function:
>
>     if (!note) {
>         error(WARNING, "cannot find NT_PRSTATUS note for cpu: %d\n", i);
>         continue;
>     }
>
> I get this:
>
> $ ./crash /usr/dumps/ARM/vmcore.pae* /usr/dumps/ARM/vmlinux.pae.gz
>
> crash 7.1.4rc15
> Copyright (C) 2002-2014 Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
> Copyright (C) 1999-2006 Hewlett-Packard Co
> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
> Copyright (C) 2005, 2011 NEC Corporation
> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions. Enter "help copying" to see the conditions.
> This program has absolutely no warranty. Enter "help warranty" for details.
>
> GNU gdb (GDB) 7.6
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "--host=x86_64-unknown-linux-gnu --target=arm-elf-linux"...
>
> WARNING: cannot find NT_PRSTATUS note for cpu: 1
> WARNING: cannot find NT_PRSTATUS note for cpu: 2
> WARNING: cannot find NT_PRSTATUS note for cpu: 3
> WARNING: cannot find NT_PRSTATUS note for cpu: 4
> KERNEL: /usr/dumps/ARM/vmlinux.pae.gz
> DUMPFILE: /usr/dumps/ARM/vmcore.pae
> CPUS: 5 [OFFLINE: 4]
> DATE: Sun Jun 8 18:27:39 2014
> UPTIME: 00:03:22
> LOAD AVERAGE: 0.16, 0.16, 0.07
> TASKS: 51
> NODENAME: buildroot
> RELEASE: 3.13.5
> VERSION: #3 SMP Mon Jun 9 05:58:39 CST 2014
> MACHINE: armv7l (unknown Mhz)
> MEMORY: 256 MB
> PANIC: "SysRq : Trigger a crash"
> PID: 732
> COMMAND: "sh"
> TASK: 8bcead00 [THREAD_INFO: 8ad32000]
> CPU: 0
> STATE: TASK_RUNNING (SYSRQ)
>
> crash> bt -a
> PID: 732 TASK: 8bcead00 CPU: 0 COMMAND: "sh"
> #0 [<80265064>] (sysrq_handle_crash) from [<80265810>]
> #1 [<80265810>] (__handle_sysrq) from [<80265928>]
> #2 [<80265928>] (write_sysrq_trigger) from [<80112120>]
> #3 [<80112120>] (proc_reg_write) from [<800c9840>]
> #4 [<800c9840>] (vfs_write) from [<800c9be4>]
> #5 [<800c9be4>] (sys_write) from [<8000e3e0>]
> pc : [<76e9cfdc>] lr : [<0000f998>] psr: 600d0010
> sp : 7eab862c ip : 00000000 fp : 000a82a4
> r10: 00000020 r9 : 000a8294 r8 : 00000001
> r7 : 00000004 r6 : 000a9bf0 r5 : 00000001 r4 : 000a7d88
> r3 : 00000000 r2 : 00000002 r1 : 000a9bf0 r0 : 00000001
> Flags: nZCv IRQs on FIQs on Mode USER_32 ISA ARM
>
> PID: 0 TASK: 8bc561c0 CPU: 1 COMMAND: "swapper/1"
> bt: WARNING: cannot determine starting stack frame for task 8bc561c0
>
> PID: 0 TASK: 8bc56580 CPU: 2 COMMAND: "swapper/2"
> bt: WARNING: cannot determine starting stack frame for task 8bc56580
>
> PID: 0 TASK: 8bc56940 CPU: 3 COMMAND: "swapper/3"
> bt: WARNING: cannot determine starting stack frame for task 8bc56940
>
> PID: 0 TASK: 8bc56d00 CPU: 4 COMMAND: "swapper/4"
> bt: WARNING: cannot determine starting stack frame for task 8bc56d00
> crash>
>
> Note that if I did "goto fail" instead of "continue", I lose the good
> cpu 0 backtrace from the NT_PRSTATUS that your patch found, so doing it
> this way is the best of both worlds.

I agree that the 'if (!note) continue' that you added is a good idea to
try and salvage this type of dump. It shouldn't happen with qemu generated
dumps, but anything's possible when a kernel panics...

Would you like me to spin a v4 with this condition added? Or, since it
actually seems to be addressing a non-qemu-generated dump issue, then
maybe you just want to submit it as a new patch on top of the qemu patch?

Thanks,
drew

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
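
For illustration only, the difference between "continue" and "goto fail"
discussed in this thread can be sketched in a few lines of standalone C.
This is not the crash source: struct note_hdr, collect_notes() and the
bail_on_missing flag are hypothetical stand-ins for the per-cpu loop in
arm_get_crash_notes(), and the array of note pointers is an assumed
substitute for whatever the dumpfile lookup returns for each cpu.

```c
#include <stddef.h>
#include <stdio.h>

#define NT_PRSTATUS 1   /* same numeric value as in <elf.h> */

/* Hypothetical, simplified stand-in for an ELF note header. */
struct note_hdr {
    unsigned int n_namesz;
    unsigned int n_descsz;
    unsigned int n_type;
};

/*
 * Hypothetical per-cpu collector. notes[i] is the note located for
 * cpu i, or NULL if none was found. valid[i] is set to 1 for every cpu
 * whose note is usable. Returns the number of usable notes.
 *
 * With bail_on_missing == 0 a missing note only skips that cpu
 * ("continue"). With bail_on_missing != 0 a missing note discards
 * everything collected so far ("goto fail").
 */
static int collect_notes(const struct note_hdr *notes[], int ncpus,
                         int valid[], int bail_on_missing)
{
    int found = 0;

    for (int i = 0; i < ncpus; i++)
        valid[i] = 0;

    for (int i = 0; i < ncpus; i++) {
        const struct note_hdr *note = notes[i];

        if (!note) {
            fprintf(stderr,
                    "WARNING: cannot find NT_PRSTATUS note for cpu: %d\n", i);
            if (bail_on_missing)
                goto fail;
            continue;   /* keep what the other cpus gave us */
        }

        if (note->n_type != NT_PRSTATUS) {
            fprintf(stderr, "WARNING: invalid note (n_type != NT_PRSTATUS)\n");
            goto fail;
        }

        valid[i] = 1;
        found++;
    }
    return found;

fail:
    /* Throw away everything, including cpus that had good notes. */
    for (int i = 0; i < ncpus; i++)
        valid[i] = 0;
    return 0;
}
```

With one good note for cpu 0 and NULL for cpus 1-4, the "continue"
variant still reports cpu 0 as usable, while the "goto fail" variant
reports nothing, which mirrors the lost cpu 0 backtrace described above.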