Download from: http://people.redhat.com/anderson
               or
               https://github.com/crash-utility/crash/releases

The GitHub master branch serves as a development branch that will
contain all patches that are queued for the next release:

  $ git clone git://github.com/crash-utility/crash.git

Changelog:

 - Introduction of support for "live" ramdump files, such as those
   that are specified by the QEMU mem-path argument of a
   memory-backend-file object. This allows the running of a live
   crash session against a QEMU guest from the host machine. In this
   example, the /tmp/MEM file on a QEMU host represents the guest's
   physical memory:

     $ qemu-kvm ...other-options... \
       -object memory-backend-file,id=MEM,size=128m,mem-path=/tmp/MEM,share=on \
       -numa node,memdev=MEM -m 128

   and a live session can be run against the guest kernel like so:

     $ crash <path-to-guest-vmlinux> live:/tmp/MEM@0

   By prepending the ramdump image name with "live:", the crash
   session will act as if it were running a normal live session.
   (oleg@xxxxxxxxxx)

 - Fix for the support of ELF vmcores created by the KVM
   "virsh dump --memory-only" facility if the guest kernel was not
   configured with CONFIG_KEXEC, or CONFIG_KEXEC_CORE in Linux 4.3
   and later kernels. Without the patch, the crash session fails
   during initialization with the message "crash: cannot resolve
   kexec_crash_image".
   (hirofumi@xxxxxxxxxxxxxxxxxx)

 - Added support for x86_64 ramdump files. Without the patch, the
   crash session fails immediately with the message "ramdump:
   unsupported machine type: X86_64".
   (anderson@xxxxxxxxxx)

 - Fix for a "[-Werror=misleading-indentation]" compiler warning that
   is generated by gdb-7.6/bfd/elf64-s390.c when building S390X in a
   Fedora Rawhide environment with gcc-6.0.0.
   (anderson@xxxxxxxxxx)

 - Recognize and parse the new QEMU_VM_CONFIGURATION and
   QEMU_VM_FOOTER sections used for live migration of KVM guests,
   which are seen in the "kvmdump" format generated if "virsh dump"
   is used without the "--memory-only" option.
   (pagupta@xxxxxxxxxx)

 - Fix for Linux commit edf14cdbf9a0e5ab52698ca66d07a76ade0d5c46,
   which has appended a NULL entry as the final member of the
   pageflag_names[] array. Without the patch, a message that
   indicates "crash: failed to read pageflag_names entry" is
   displayed during session initialization in Linux 4.6 kernels.
   (andrej.skvortzov@xxxxxxxxx)

 - Fix for Linux commit 0139aa7b7fa12ceef095d99dc36606a5b10ab83a,
   which renamed the page._count member to page._refcount. Without
   the patch, certain "kmem" commands fail with the message "kmem:
   invalid structure member offset: page_count".
   (anderson@xxxxxxxxxx)

 - Fix for an ARM64 crash-7.1.5 "bt" regression for a task that has
   called panic(). Without the patch, the backtrace may fail with a
   message such as "bt: WARNING: corrupt prstatus? pstate=0x20000000,
   but no user frame found" followed by "bt: WARNING: cannot
   determine starting stack frame for task <address>". The pstate
   register warning will still be displayed (as it is essentially a
   kdump bug), but the backtrace will proceed normally.
   (anderson@xxxxxxxxxx)

 - Fix for the ARM64 "bt" command in Linux 4.5 and later kernels
   which use per-cpu IRQ stacks. Without the patch, if an active
   non-crashing task was running in user space when it received the
   shutdown IPI from the crashing task, the "-- <IRQ stack> ---"
   transition marker from the IRQ stack to the process stack is not
   displayed, and a message indicating "bt: WARNING:
   arm64_unwind_frame: on IRQ stack: oriq_sp: <address> fp: 0 (?)"
   gets displayed.
   (anderson@xxxxxxxxxx)

 - Fix for the ARM64 "bt" command in Linux 4.5 and later kernels
   which are not configured with CONFIG_FUNCTION_GRAPH_TRACER.
   Without the patch, backtraces that originate from a per-cpu IRQ
   stack will dump an invalid exception frame before transitioning
   to the process stack.
   (anderson@xxxxxxxxxx)

 - Introduction of ARM64 support for 4K pages with 4-level page
   tables and 48 VA bits.
   (takahiro.akashi@xxxxxxxxxx)

 - Implemented support for the redesigned ARM64 kernel virtual memory
   layout and associated KASLR support that was introduced in Linux
   4.6. The kernel text and static data have been moved from
   unity-mapped memory into the vmalloc region, and their start
   address can be randomized if CONFIG_RANDOMIZE_BASE is configured.
   Related support is being put into the kernel's kdump code, the
   kexec-tools package, and makedumpfile(8); with that in place, the
   analysis of Linux 4.6 ARM64 dumpfiles with or without KASLR
   enabled should work normally by entering "crash vmlinux vmcore".
   On live systems, Linux 4.6 ARM64 kernels will only work
   automatically if CONFIG_RANDOMIZE_BASE is not configured.
   Unfortunately, if CONFIG_RANDOMIZE_BASE is configured on a live
   system, two --machdep command line arguments are required, at
   least for the time being. The arguments are:

     --machdep phys_offset=<base physical address>
     --machdep kimage_voffset=<kernel kimage_voffset value>

   Without the patch, any attempt to analyze a Linux 4.6 ARM64 kernel
   fails during initialization with a stream of "read error" messages
   followed by "crash: vmlinux and vmcore do not match!".
   (takahiro.akashi@xxxxxxxxxx)

 - Linux 3.15 and later kernels configured with CONFIG_RANDOMIZE_BASE
   could be identified because of the "randomize_modules" kernel
   symbol, and if it existed, the "--kaslr=<offset>" and/or
   "--kaslr=auto" options were unnecessary. Since the
   "randomize_modules" symbol was removed in Linux 4.1, this patch
   has replaced the KASLR identifier with the "module_load_offset"
   symbol, which was also introduced in Linux 3.15, but still
   remains.
   (anderson@xxxxxxxxxx)

 - Improvement of the ARM64 "bt -f" display such that in most cases,
   each stack frame level delimiter will be set to the stack address
   location containing the old FP and old LR pair.
   (takahiro.akashi@xxxxxxxxxx)

 - Fix for the introduction of ARM64 support for 64K pages with
   3-level page tables in crash-7.1.5, which fails to translate user
   space virtual addresses. Without the patch, "vtop <user-space
   address>" fails to translate all user-space addresses, and any
   command that needs to either translate or read user-space memory,
   such as "vm -p", "ps -a", and "rd -u" will fail.
   (anderson@xxxxxxxxxx)

 - Enhancement of the error message generated by the "tree -t radix"
   option when a duplicate entry is encountered. Without the patch,
   the error message shows the address of the radix_tree_node that
   contains the duplicate entry, for example, "tree: duplicate tree
   entry: <radix_tree_node>". It has been changed to also display
   the radix_tree_node.slots[] array index and the duplicate entry
   value, for example, "tree: duplicate tree entry: radix_tree_node:
   <radix_tree_node> slots[<index>]: <entry>".
   (anderson@xxxxxxxxxx)

 - Introduction of a new "bt -v" option that checks the kernel stack
   of all tasks for evidence of stack overflows. It does so by
   verifying the thread_info.task address, ensuring the
   thread_info.cpu value is a valid cpu number, and checking the end
   of the stack for the STACK_END_MAGIC value.
   (anderson@xxxxxxxxxx)

 - Fix to recognize a kernel thread that has user space virtual
   memory attached to it. While kernel threads typically do not have
   an mm_struct referencing a user-space virtual address space, they
   can either temporarily reference one for a user-space copy
   operation, or in the case of KVM "vhost" kernel threads, keep a
   reference to the user space of the "qemu-kvm" task that created
   them. Without the patch, they will be mistaken for user tasks;
   the "bt" command will display an invalid kernel-entry exception
   frame that indicates "[exception RIP: unknown or invalid
   address]", the "ps" command will not enclose the command name
   with brackets, and the "ps -[uk]" and "foreach [user|kernel]"
   options will show the kernel thread as a user task.
   (anderson@xxxxxxxxxx)

 - Fix for the "bt -[eE]" options on ARM64 to recognize kernel
   exception frames in VHE enabled systems, in which the kernel runs
   in EL2.
   (takahiro.akashi@xxxxxxxxxx)

 - Fix for the extensions/trace.c extension module to account for
   the Linux 4.7 kernel commit dcb0b5575d24 that changed the bit
   index for the TRACE_EVENT_FL_TRACEPOINT flag. Without the patch,
   the "extend" command fails to load the trace.so module, with the
   error message "extend: /path/to/crash/extensions/trace.so: no
   commands registered: shared object unloaded". The patch reads the
   flag's enum value dynamically instead of using a hard-coded value.
   (namhyung@xxxxxxxxx)

 - Incorporated Takahiro Akashi's alternative backtrace method as a
   "bt" option, which can be accessed using "bt -o", and where
   "bt -O" will toggle the original and optional methods as the
   default. The original backtrace method has adopted two
   changes/features from the optional method:

     (1) ORIG_X0 and SYSCALLNO registers are not displayed in kernel
         exception frames.
     (2) stackframe entry text locations are modified to be the PC
         address of the branch instruction instead of the subsequent
         "return" PC address contained in the stackframe link
         register.

   Accordingly, these are the essential differences between the
   original and optional methods:

     (1) optional: the backtrace will start with the IPI exception
         frame located on the process stack.
     (2) original: the starting point of backtraces for the active,
         non-crashing, tasks, will continue to have crash_save_cpu()
         on the IRQ stack as the starting point.
     (3) optional: the exception entry stackframe adjusted to be
         located farther down in the IRQ stack.
     (4) optional: bt -f does not display IRQ stack memory above the
         adjusted exception entry stackframe.
     (5) optional: may display "(Next exception frame might be
         wrong)".
   (takahiro.akashi@xxxxxxxxxx, anderson@xxxxxxxxxx)

 - Fix for the failure of the "sym <symbol>" option in the extremely
   unlikely case where the symbol's name string is composed entirely
   of hexadecimal characters. For example, without the patch,
   "sym e820" fails with the error message "sym: invalid address:
   e820".
   (anderson@xxxxxxxxxx)

 - Fix for the failure of the "dis <symbol>" option in the extremely
   unlikely case where the symbol's name string is composed entirely
   of hexadecimal characters. For example, without the patch,
   "dis f" fails with the error message "dis: WARNING: f: no
   associated kernel symbol found" followed by "0xf: Cannot access
   memory at address 0xf".
   (anderson@xxxxxxxxxx)

 - Fix for the X86_64 "bt -R <symbol>" option if the only reference
   to the kernel text symbol in a backtrace is contained within the
   "[exception RIP: <symbol+offset>]" line of an exception frame
   dump. Without the patch, the reference will only be picked up if
   the exception RIP's hexadecimal address value is used.
   (anderson@xxxxxxxxxx)

 - Fix for the ARM64 "bt -R <symbol>" option if the only reference to
   the kernel text symbol in a backtrace is contained within the
   "[PC: <address> [<symbol+offset>]" line of an exception frame
   dump. Without the patch, the reference will only be picked up if
   the PC's hexadecimal address value is used.
   (anderson@xxxxxxxxxx)

 - Fix for the gathering of module symbol name strings during session
   initialization. In the unlikely case where the ordering of module
   symbol name strings does not match the order of the kernel_symbol
   structures, a faulty module symbol list entry may be created that
   contains a bogus name string.
   (sebastien.piechurski@xxxxxxxx)

 - Fix the PERCENTAGE of total output of the "kmem -i" SWAP USED line
   when the system has no swap pages at all. Without the patch, both
   the PAGES and TOTAL columns show values of zero, but it
   confusingly shows "100% of TOTAL SWAP", which upon first glance
   may seem to indicate potential memory pressure.
   (jsiddle@xxxxxxxxxx)

 - Enhancement to determine structure member data if the member is
   contained within an anonymous structure or union. Without the
   patch, it is necessary to parse the output of a discrete gdb
   "printf" command to determine the offset of such a structure
   member.
   (Alexandr_Terekhov@xxxxxxxx)

 - Speed up session initialization by attempting MEMBER_OFFSET_INIT()
   before falling back to ANON_MEMBER_OFFSET_INIT() in several known
   cases of structure members that are contained within anonymous
   structures.
   (anderson@xxxxxxxxxx)

 - Implemented new "list -S" and "tree -S" options that are similar
   to each command's -s option, but instead of parsing gdb output,
   member values are read directly from memory, so the command is
   much faster for 1-, 2-, 4-, and 8-byte members.
   (Alexandr_Terekhov@xxxxxxxx)

 - Fix to recognize and support x86_64 Linux 4.8-rc1 and later
   kernels that are configured with CONFIG_RANDOMIZE_MEMORY, which
   randomizes the base addresses of the kernel's unity-map address
   (PAGE_OFFSET), and the vmalloc region. Without the patch, the
   crash utility fails with a segmentation violation during session
   initialization on a live system, or will generate a number of
   WARNING messages followed by the fatal error message "crash:
   vmlinux and <dumpfile name> do not match!" with dumpfiles.
   (anderson@xxxxxxxxxx)

 - Fix for Linux 4.1 commit d0a0de21f82bbc1737ea3c831f018d0c2bc6b9c2,
   which renamed the x86_64 "init_tss" per-cpu variable to "cpu_tss".
   Without the patch, the addresses of the 4 per-cpu exception stacks
   cannot be determined, which causes backtraces that originate on
   any of the per-cpu DOUBLEFAULT, NMI, DEBUG, or MCE stacks to be
   truncated.
   (anderson@xxxxxxxxxx)

 - With the introduction of radix MMU in Power ISA 3.0, there are
   changes in kernel page table management accommodating it. This
   patch series makes appropriate changes here to work for such
   kernels.
   Also, this series fixes a few bugs along the way:

     ppc64: fix vtop page translation for 4K pages
     ppc64: Use kernel terminology for each level in 4-level page table
     ppc64/book3s: address changes in kernel v4.5
     ppc64/book3s: address change in page flags for PowerISA v3.0
     ppc64: use physical addresses and unfold pud for 64K page size
     ppc64/book3s: support big endian Linux page tables

   The patches are needed for Linux v4.5 and later kernels on all
   ppc64 hardware.
   (hbathini@xxxxxxxxxxxxxxxxxx)

 - Fix for Linux 4.8-rc1 commit
   500462a9de657f86edaa102f8ab6bff7f7e43fc2, in which Thomas Gleixner
   redesigned the kernel timer mechanism to switch to a non-cascading
   wheel. Without the patch, the "timer" command fails with the
   message "timer: zero-size memory allocation! (called from
   <address>)".
   (anderson@xxxxxxxxxx)

 - Support for PPC64/BOOK3S virtual address translation for radix
   MMU. As both radix and hash MMU are supported in a single kernel
   on Power ISA 3.0 based server processors, identify the current MMU
   type and set page table index values accordingly. Also, in Linux
   4.7 and later kernels, PPC64/BOOK3S uses the same masked bit
   values in page table entries for 4K and 64K page sizes.
   (hbathini@xxxxxxxxxxxxxxxxxx)

 - Change the RESIZEBUF() macro so that it will accept buffer
   pointers that are not declared as "char *" types. Change two
   prior direct callers of resizebuf() to use RESIZEBUF(), and fix
   two prior users of RESIZEBUF() to correctly calculate the need to
   resize their buffers.
   (anderson@xxxxxxxxxx)

 - Fix for the "trace.so" extension module to properly recognize
   Linux 3.15 and later kernels. In crash-7.1.6, the MEMBER_OFFSET()
   macro has been improved so that it is able to recognize members of
   embedded anonymous structures. However, the module's manner of
   recognizing Linux 3.15 and later kernels depended upon
   MEMBER_OFFSET() failing to handle anonymous members, and therefore
   the improvement prevented the module from successfully loading.
   (rabinv@xxxxxxxx)

 - If a "struct" command address argument is expressed using the
   per-cpu "symbol:cpuspec" format, and the symbol is a pointer type,
   i.e., not the address of the structure, display a WARNING message.
   (atomlin@xxxxxxxxxx)

 - Exclude ARM64 kernel module linker mapping symbols like "$d" and
   "$x" as is done with 32-bit ARM. Without the patch, a crash
   session may fail during the "gathering module symbol data" stage
   with a message similar to "crash: store_module_symbols_v2: total:
   15 mcnt: 16".
   (takahiro.akashi@xxxxxxxxxx)

 - Enhancement to the ARM64 "dis" command when the kernel has enabled
   KASLR. When KASLR is enabled on ARM64, a function call between a
   module and the base kernel code will be done via a veneer (PLT) if
   the displacement is more than +/-128MB. As a result, disassembled
   code will show a branch to the in-module veneer location instead
   of the in-kernel target location. To avoid confusion, the output
   of the "dis" command will translate the veneer location to the
   target location preceded by "plt:", for example, "<plt:printk>".
   (takahiro.akashi@xxxxxxxxxx)

 - Improvement of the "dev -d" option to display I/O statistics for
   disks whose device driver uses the blk-mq interface. Currently
   "dev -d" always displays 0 in all fields for the blk-mq disk
   because blk-mq does not increment/decrement request_list.count[2]
   on I/O creation and I/O completion. The following values are used
   in blk-mq in such situations:

     - I/O creation:   blk_mq_ctx.rq_dispatched[2]
     - I/O completion: blk_mq_ctx.rq_completed[2]

   So, we can get the counter of in-progress I/Os as follows:

     in progress I/Os == rq_dispatched - rq_completed

   This patch displays the result of the above calculation for the
   disk. It treats the device driver as using blk-mq if
   request_queue.mq_ops is not NULL. The "DRV" field is displayed as
   "N/A(MQ)" if the value for in-flight in the device driver does not
   exist for blk-mq.
   (m.mizuma@xxxxxxxxxxxxxx)
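
The three checks listed in the "bt -v" entry can be illustrated with a minimal Python sketch. It is not the crash implementation: the StackSnapshot type, its field names, and check_stack() are hypothetical, and the values are assumed to have already been read from the dumpfile. STACK_END_MAGIC is the constant defined in the Linux kernel headers.

```python
from dataclasses import dataclass
from typing import List

STACK_END_MAGIC = 0x57AC6E9D  # value defined by the Linux kernel headers


@dataclass
class StackSnapshot:
    """Hypothetical container for values already read from one task's stack."""
    task_addr: int   # address of the task_struct being checked
    ti_task: int     # thread_info.task as read from the stack
    ti_cpu: int      # thread_info.cpu as read from the stack
    end_word: int    # word found at the end of the stack


def check_stack(snap: StackSnapshot, nr_cpus: int) -> List[str]:
    """Return the names of the checks that failed (empty list if none)."""
    problems = []
    if snap.ti_task != snap.task_addr:
        problems.append("thread_info.task")
    if not 0 <= snap.ti_cpu < nr_cpus:
        problems.append("thread_info.cpu")
    if snap.end_word != STACK_END_MAGIC:
        problems.append("STACK_END_MAGIC")
    return problems


good = StackSnapshot(0xFFFF880012340000, 0xFFFF880012340000, 1, STACK_END_MAGIC)
bad = StackSnapshot(0xFFFF880012340000, 0, 64, 0)
print(check_stack(good, nr_cpus=4))
print(check_stack(bad, nr_cpus=4))
```

A task whose stack has overflowed would typically fail one or more of these checks, since the thread_info fields and the magic word sit at the end of the stack that an overflow overwrites first.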
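
The ambiguity behind the "sym e820" and "dis f" entries is that an argument made up only of hexadecimal characters can be read either as a symbol name or as an address. One plausible way to resolve it, shown here as a Python sketch and not necessarily how crash does it, is to try the symbol table first and fall back to hexadecimal parsing. The resolve() helper and the symbol addresses are hypothetical.

```python
from typing import Dict


def resolve(arg: str, symbols: Dict[str, int]) -> int:
    """Resolve a command argument to an address.

    A known symbol name wins over a hexadecimal interpretation, so a
    symbol called "e820" is not mistaken for the address 0xe820.
    """
    if arg in symbols:
        return symbols[arg]
    try:
        return int(arg, 16)
    except ValueError:
        raise ValueError(f"invalid address or symbol: {arg}") from None


# Hypothetical symbol table; the addresses are made up for illustration.
symbols = {"e820": 0xFFFFFFFF81C0E000, "f": 0xFFFFFFFF81D00010}

print(hex(resolve("e820", symbols)))   # found as a symbol
print(hex(resolve("ffff", symbols)))   # no such symbol, parsed as hex
```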
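
The in-progress calculation from the "dev -d" entry, rq_dispatched - rq_completed, can be sketched in Python as follows. The BlkMqCtx type and the in_progress() helper are hypothetical, and summing over several blk_mq_ctx instances for one disk is an assumption of this sketch; the entry itself only gives the per-counter subtraction.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class BlkMqCtx:
    """Hypothetical snapshot of the two counters in one blk_mq_ctx."""
    rq_dispatched: Tuple[int, int]  # blk_mq_ctx.rq_dispatched[2]
    rq_completed: Tuple[int, int]   # blk_mq_ctx.rq_completed[2]


def in_progress(ctxs: Iterable[BlkMqCtx]) -> Tuple[int, int]:
    """Sum (dispatched - completed) per array slot across all contexts."""
    totals = [0, 0]
    for ctx in ctxs:
        for slot in range(2):
            totals[slot] += ctx.rq_dispatched[slot] - ctx.rq_completed[slot]
    return totals[0], totals[1]


ctxs = [
    BlkMqCtx(rq_dispatched=(10, 5), rq_completed=(7, 5)),
    BlkMqCtx(rq_dispatched=(3, 2), rq_completed=(1, 0)),
]
print(in_progress(ctxs))
```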