Hi Joe,

It's pretty clear it's due to this change in 5.1.5:

  - Implemented the capability of using the NT_PRSTATUS ELF note data
    that is saved in version 4 compressed kdump headers to determine
    the starting stack and instruction pointer hooks for x86 and
    x86_64 backtraces when they cannot be determined in the
    traditional manners.
    (wang.chao@xxxxxxxxxxxxxx, wency@xxxxxxxxxxxxxx)

What happens if you run it like so:

  $ crash --no_elf_notes vmlinux vmcore

As far as this message:

  WARNING: sparsemem: invalid section number: 137438888923

That should be outside the realm of Fujitsu's ELF notes patch.
Does this kernel have some kind of Stratus VM modification?

Dave

----- Original Message -----
> > Crash faults when determining panic task
> >
> > I have a vmcore generated on RHEL6.1 that newer versions of crash
> > have trouble analyzing (5.1.1-2.el6 seems to work ok).
> >
> > I can provide additional binary files if needed, just let me know
> > what convention best suits the list (ftp, private email attachment,
> > etc.)
> >
> > Crash Version :      OS:             Result:
> > crash 5.1.8          Debian wheezy   faults
> > crash 5.1.7-1.el6    RHEL6.2 Alpha   faults
> > crash 5.1.1-2.el6    RHEL6.1         ok
> >
> > Kernel:
> > 2.6.32-131.0.15.el6.exp10.bz16586.x86_64 (2.6.32-131.0.15 + a fix
> > for Red Hat bz - 707268)
> >
> > Interesting warnings when starting crash:
> > WARNING: sparsemem: invalid section number: 137438888923
> > WARNING: sparsemem: invalid section number: 137438888923
> >
> > First fault, null pointer dereference:
> > please wait... (determining panic task)
> > Program received signal SIGSEGV, Segmentation fault.
> > x86_64_get_dumpfile_stack_frame (rsp=0x7fffffffcc58, rip=0x7fffffffcc50,
> >     bt_in=0x7fffffffcce0) at x86_64.c:4183
> > 4183            ur_rip = ULONG(user_regs +
> > (gdb) p user_regs
> > $1 = 0x0
> >
> > Workaround, check that bt->machdep is not NULL:
> > diff -Nupr crash-5.1.8/x86_64.c crash-5.1.8.new/x86_64.c
> > --- crash-5.1.8/x86_64.c 2011-09-16 15:01:12.000000000 -0400
> > +++ crash-5.1.8.new/x86_64.c 2011-09-28 14:12:45.347188571 -0400
> > @@ -4178,7 +4178,7 @@ x86_64_get_dumpfile_stack_frame(struct b
> >                                  goto skip_stage;
> >                          }
> >                  }
> > -        } else if (ELF_NOTES_VALID()) {
> > +        } else if (ELF_NOTES_VALID() && bt->machdep) {
> >                  user_regs = bt->machdep;
> >                  ur_rip = ULONG(user_regs +
> >                          OFFSET(user_regs_struct_rip));
> >
> > Second fault, a curiously large n_descsz in elf note header:
> > please wait... (determining panic task)
> > Program received signal SIGSEGV, Segmentation fault.
> > get_regs_from_note (note=0xd26472 "\b", ip=0x7fffffffc4e0,
> >     sp=0x7fffffffc4e8) at netdump.c:2221
> > 2221            *sp = ULONG(user_regs + offset_sp);
> > (gdb) p *(Elf64_Nhdr *)note
> > $1 = {n_namesz = 8, n_descsz = 3438804992, n_type = 8}
> >
> > Workaround, do not attempt reading registers from elf notes (this
> > chunk of code was not present in crash 5.1.1):
> > diff -Nupr crash-5.1.8/netdump.c crash-5.1.8.new/netdump.c
> > --- crash-5.1.8/netdump.c 2011-09-16 15:01:12.000000000 -0400
> > +++ crash-5.1.8.new/netdump.c 2011-09-28 14:14:43.687183734 -0400
> > @@ -2286,7 +2286,7 @@ get_netdump_regs_x86_64(struct bt_info *
> >
> >                  bt->machdep = (void *)user_regs;
> >          }
> > -
> > +#if 0
> >          if (ELF_NOTES_VALID() &&
> >              (bt->flags & BT_DUMPFILE_SEARCH) && DISKDUMP_DUMPFILE() &&
> >              (note = (Elf64_Nhdr *)
> > @@ -2305,7 +2305,7 @@ get_netdump_regs_x86_64(struct bt_info *
> >
> >                  bt->machdep = (void *)user_regs;
> >          }
> > -
> > +#endif
> >          machdep->get_stack_frame(bt, ripp, rspp);
> >  }
> >
> > Given the warning messages at the beginning of the process, I'm
> > not sure if I'm dealing with a corrupted
> > or incomplete vmcore image. Let me know what additional info could
> > be useful if this seems worth debugging further.
> >
> > Thanks,
> >
> > -- Joe Lawrence
> --
> Crash-utility mailing list
> Crash-utility@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/crash-utility
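
For illustration only, here is a minimal sketch of the kind of sanity check that would reject a note header like the one in the second fault before any registers are read from it. This is not code from crash: the helper name note_desc_in_bounds(), the local struct note_hdr (which only mirrors the three fields of Elf64_Nhdr), and the idea that the caller knows the size of the buffer holding the notes are all assumptions. It also assumes the usual 4-byte padding of the note name and descriptor.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the three 32-bit fields of Elf64_Nhdr; redefined here so the
 * sketch builds without system ELF headers (an assumption for brevity). */
struct note_hdr {
    uint32_t n_namesz;
    uint32_t n_descsz;
    uint32_t n_type;
};

/* Round up to a multiple of 4, the padding assumed for note name/desc. */
static uint64_t note_align4(uint64_t n)
{
    return (n + 3) & ~(uint64_t)3;
}

/*
 * Hypothetical helper, not part of crash.
 * Returns 1 if the note starting at byte offset `off` inside a buffer of
 * `buf_len` bytes fits entirely within that buffer and its descriptor
 * holds at least `min_desc` bytes; returns 0 otherwise.
 */
static int note_desc_in_bounds(const struct note_hdr *note, size_t off,
                               size_t buf_len, size_t min_desc)
{
    uint64_t need;

    /* The header itself must fit in what is left of the buffer. */
    if (off > buf_len || buf_len - off < sizeof(*note))
        return 0;

    /* 64-bit arithmetic so large 32-bit sizes cannot wrap around. */
    need = (uint64_t)sizeof(*note)
         + note_align4(note->n_namesz)
         + note_align4(note->n_descsz);
    if (need > (uint64_t)(buf_len - off))
        return 0;

    /* The descriptor must be big enough for the data the caller wants. */
    return note->n_descsz >= min_desc;
}
```

With the header from the gdb output above (n_namesz = 8, n_descsz = 3438804992), this check fails for any realistically sized note buffer, so a caller could skip the ELF note path and fall back to the traditional method instead of reading past the end of the buffer.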