- Kdump ELF vmcores contain NT_PRSTATUS notes for online cpus only, so if cpus have been offlined prior to a crash, there will be fewer notes than the number of cpus in the system, and therefore there will not be a one-to-one correlation between each cpu and its associated NT_PRSTATUS note. That causes backtrace failures for architectures like ppc64 that depend upon the contents of the NT_PRSTATUS notes for gathering the starting stack location.
  (chandru@xxxxxxxxxx, anderson@xxxxxxxxxx)

- Fix and enhancement for the "dev" command. When the command was run against 2.6.26 or later kernels, it would fail with the error message "dev: invalid structure member offset: char_device_struct_fops". Additionally, even when the command did work, more often than not it would fail to determine the file_operations structure associated with the block or character device, and erroneously display "(none)" or "(unused)". This patch makes a more comprehensive search for the file_operations structure, and instead of just displaying its address and symbolic translation, it will display the address of the data structure that contains the pointer to the file_operations structure, along with the symbolic translation of the file_operations structure. For character devices, the containing structure is a "cdev", and for block devices the containing structure is a "gendisk". The command output adds new CDEV and GENDISK columns, and under the OPERATIONS column is the symbolic translation of its file_operations structure.
  (anderson@xxxxxxxxxx, bob.montgomery@xxxxxx)

- Fix for a potential segmentation violation when running "foreach bt" on a very active live system with many processes starting and ending. Without the patch, a segmentation violation could occur when a "bt" was attempted on a task that had become non-existent. This would happen on x86_64 or ppc64 machines, and was due to the usage of a kernel stack pointer taken from a stale/invalid task_struct.
  The command will now recognize the bad stack pointer and display the error message "bt: task no longer exists" or "bt: invalid/stale stack pointer for this task: <address>".
  (anderson@xxxxxxxxxx)

- Fix to correctly read LKCD Version 8 and later x86 dumpfile headers.
  (talk90091e@xxxxxxxxx)

- If a kdump NMI issued to a non-crashing x86_64 cpu was received while running in schedule(), after having set the next task as "current" in the cpu's runqueue, but prior to changing the kernel stack to that of the next task, then a backtrace would fail to make the transition from the NMI exception stack back to the process stack, with the error message "bt: cannot transition from exception stack to current process stack". This patch will report inconsistencies found between a task marked as the current task in a cpu's runqueue, and the task found in the per-cpu x8664_pda "pcurrent" field (2.6.29 and earlier) or the per-cpu "current_task" variable (2.6.30 and later). If it can be safely determined that the runqueue setting (used by default) is premature, then the crash utility's internal per-cpu active task will be changed to be the task indicated by the appropriate architecture specific value. Also, a new "set -a <task>" option has been added to manually set a task to be the "active" task on its cpu.
  (anderson@xxxxxxxxxx)

- Fix for x86_64 "bt" command when transitioning from the IRQ stack back to the process stack on 2.6.29 and later kernels. Without the patch, the interrupt exception frame address on the process stack would be incorrectly determined, and its display would typically be preceded by "[exception RIP: unknown or invalid address]", and the backtrace would fail from that point on.
  (anderson@xxxxxxxxxx)

- Enhancement to the "runq" command to show the current task in each cpu's runqueue, plus a few formatting changes to make the output easier to understand.
  (anderson@xxxxxxxxxx)

- Fix for a memory leak when running on live systems, due to the repetitive reallocation of the internal array of active tasks.
  (anderson@xxxxxxxxxx)

- Fix for usage with vmlinux debuginfo files using Dwarf 3 format, for example, the Fedora 2.6.31-0.24.rc0.git18.fc12 kernel. Without the patch, the crash session fails during initialization with the error message: "Dwarf Error: wrong version in compilation unit header (is 3, should be 2) [in module <path-to>/vmlinux]", followed by the erroneous message "crash: <path-to>/vmlinux: no debugging data available". The patch simply accepts the Dwarf 3 header, and the embedded gdb-6.1 version still appears to work with the updated vmlinux debuginfo file format.
  (anderson@xxxxxxxxxx)

- Fix for an invocation failure when a System.map file is used as an argument with a compressed diskdump or compressed kdump dumpfile. If the System.map argument appears after the vmcore file on the command line, as in: "crash vmcore System.map vmlinux", the crash session fails immediately with the error message: "crash: vmcore: initialization failed". With the patch, the arguments may be entered in any order.
  (anderson@xxxxxxxxxx)

- Fix for a potential segmentation violation during invocation if a vmcore file, a System.map file, and a non-matching vmlinux file are used as command line arguments. The problem is that whenever a System.map file is used, it is presumed that the user knows what he is doing, and that the vmlinux file is not the same as the kernel that generated the vmcore; therefore the vmlinux/vmcore matching and verification routines are not performed. However, if the kernel data structures in the non-matching vmlinux vary widely enough from the kernel that generated the vmcore, all manner of bogus data may be read and consumed.
  The reported segmentation violation occurred when using a vmcore created from a "stock" Red Hat kernel with a vmlinux file from a Red Hat "debug" kernel, where the kernel data structures are significantly different. The patch adds several new defensive mechanisms, and displays additional warning messages, when invalid or questionable data is read, and as a result the crash session will fail in a more reasonable manner.
  (anderson@xxxxxxxxxx)

- Adjusted several virtual and physical memory address definitions for 2.6.31 x86_64 kernels: MAX_PHYSMEM_BITS, VMALLOC_START, VMALLOC_END, VMEMMAP_VADDR, VMEMMAP_END, MODULES_VADDR and MODULES_END. Without the patch, when run against CONFIG_SPARSEMEM_VMEMMAP 2.6.31 kernels, the "kmem -i" option would hang, and when run against CONFIG_SLUB and CONFIG_SPARSEMEM_VMEMMAP 2.6.31 kernels, the "kmem -s" option would report numerous errors indicating "kmem: read error: kernel virtual address: <address> type: page inuse", where the <address> was a legitimate virtual-memmap page structure address.
  (anderson@xxxxxxxxxx)

- Improvement for CONFIG_SLUB "kmem -s" or "kmem -S" options when an invalid slab page link address is encountered. Without the patch, the commands fail with a generic "invalid kernel virtual address" read error message, and "kmem -s" would not display any previously collected statistics. With the patch, the error message displays the slab cache name, the list type, and the invalid pointer found, for example, "kmem: dentry: partial list: page.lru.next: 100100".
  (anderson@xxxxxxxxxx)

Download from: http://people.redhat.com/anderson

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
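
As an illustration of the "foreach bt" fix listed above, the following is a minimal Python sketch of the kind of check that could reject a stale task before a backtrace is attempted. It is not taken from the crash utility's source: the Task class, the check_task() function, the STACK_SIZE value and the hexadecimal address formatting are all hypothetical, and the range test is only one plausible way of recognizing a bad stack pointer.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Assumed kernel stack size in bytes; the real value depends on the
# architecture and kernel configuration.
STACK_SIZE = 8192


@dataclass
class Task:
    """Hypothetical snapshot of the fields a backtrace needs from a task_struct."""
    pid: int
    stack_base: int  # lowest address of the task's kernel stack
    stack_ptr: int   # saved kernel stack pointer


def check_task(task: Task, live_pids: Set[int]) -> Optional[str]:
    """Return an error message if the task cannot be backtraced, else None."""
    # The task may have exited since the task list was gathered.
    if task.pid not in live_pids:
        return "bt: task no longer exists"
    # A stack pointer outside the task's own stack is treated as stale.
    if not (task.stack_base <= task.stack_ptr < task.stack_base + STACK_SIZE):
        return ("bt: invalid/stale stack pointer for this task: %x"
                % task.stack_ptr)
    return None
```

In this sketch a caller would print the returned message and skip the task instead of walking a stack that may no longer belong to it.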