----- "Hedi Berriche" <hedi@xxxxxxx> wrote:

> Context:
>
> - crash-5.0.1
> - glibc 2.4
> - vmcore produced by x86_64 sles11 2.6.27.19-5-default
>
> Problem:
>
> crash> mod -s xfs /usr/people/hedi/xfs.ko.debug
> mod: xfs: last symbol is not _MODULE_END_xfs?
> *** glibc detected *** /tr/x86_64/bin/crash: double free or corruption
> (!prev): 0x0000000001558760 ***
> <segmentation violation in gdb>
> mod: /usr/people/hedi/xfs.ko.debug
> gdb add-symbol-file command failed
>
> hangs solid there and has to be killed with SIGKILL.
>
> Grabbing a core reveals the following:
>
> (gdb) bt f
> #0 0x00002b628cd0ebb5 in raise () from /lib64/libc.so.6
> #1 0x00002b628cd0ffb0 in abort () from /lib64/libc.so.6
> #2 0x00002b628cd4a340 in malloc_printerr () from /lib64/libc.so.6
> #3 0x00000000005454af in parse_exp_in_context (stringptr=0x400000000,
>     block=<value optimized out>, comma=<value optimized out>,
>     void_context_p=0, out_subexp=0x7b4760)
>     at parse.c:1101
>         except = {reason = RETURN_ERROR, error = GENERIC_ERROR,
>           message = 0x1c790a0 "Dwarf Error: Could not find abbrev number 188 [in
> module /usr/people/hedi/xfs.ko.debug]"}
>         old_chain = (struct cleanup *) 0x0
>         subexp = <value optimized out>
> #4 0x000000060000000b in ?? ()
> #5 0x0000000000000000 in ?? ()
>
> (gdb) f 3
> #3 0x00000000005454af in parse_exp_in_context (stringptr=0x400000000,
>     block=<value optimized out>, comma=<value optimized out>,
>     void_context_p=0, out_subexp=0x7b4760)
>     at parse.c:1101
> 1101              xfree (expout);
>
> (gdb) list
> 1096        }
> 1097      if (except.reason < 0)
> 1098        {
> 1099          if (! in_parse_field)
> 1100            {
> 1101              xfree (expout);
> 1102              throw_exception (except);
> 1103            }
> 1104        }
> 1105
>
> Not sure (yet) whether the error
>
> mod: xfs: last symbol is not _MODULE_END_xfs?
> Dwarf Error: Could not find abbrev number 188 [in module /usr/people/hedi/xfs.ko.debug]
>
> is a problem in crash or in the xfs.ko.debug objfile but that's another story,
> the problem here is that crash shouldn't crash.
>
> FWIW, this problem is most definitely a regression, indeed crash version
> 4.-8.11, for example, fails to load the objfile, with exactly the same error
> message, with the notable difference that it does *not* crash.

Agreed on all counts.

It's crashing now because of the gdb-7.0 integration, and the
attached patch should fix that.

As far as the embedded "add-symbol-file" failure to load the
module, you're right, that's another issue, and what I can
suggest is this:

  crash> set debug 1
  crash> mod -s xfs /usr/people/hedi/xfs.ko.debug

and you will see the full "add-symbol-file" gdb command string
that's failing. For that matter you can take that full string,
remove crash from the picture entirely, and just enter it into
a gdb session:

  $ gdb
  ...
  add-symbol-file arg arg arg...

It looks like some kind of Dwarf issue though, and I can't help
with that.

However, at least on a RHEL environment, the argument to the
mod command should be the stripped module.ko file, and the
module.ko.debug file gets found automatically, and the two
pieces put together. In other words, taking the "ext3" module,
my RHEL5 environment has:

  /lib/modules/2.6.18-128.el5/kernel/fs/ext3/ext3.ko
  /usr/lib/debug/lib/modules/2.6.18-128.el5/kernel/fs/ext3/ext3.ko.debug

And when it gets loaded, the base "ext3.ko" file is used as the
internal argument to the gdb "add-symbol-file" command:

  crash> mod -s ext3
       MODULE       NAME    SIZE  OBJECT FILE
  ffffffff8806ae00  ext3  168017  /lib/modules/2.6.18-128.el5/kernel/fs/ext3/ext3.ko
  crash>

I wonder if you would still see the same issue if you used
the base "xfs.ko" file instead of "xfs.ko.debug"?

For the first time I saw one of those (harmless) "last symbol is not
_MODULE_END_xxx" messages on a 2.6.32 x86 kernel the other day.
I'll look into that.

And lastly:

> P.S. The "last symbol is not _MODULE_END_<modulename>" has been reported
> back in Jan 2009 (albeit with the difference that crash would load the
> objfile despite the error message)
>
> https://www.redhat.com/archives/crash-utility/2009-January/msg00070.html
>
> but I am not sure the root cause was identified back then, or at least I am
> failing to find, in the list archives, any proof of that.

I don't know what the deal was with that...

Dave

Index: symbols.c
===================================================================
RCS file: /nfs/projects/cvs/crash/symbols.c,v
retrieving revision 1.200
diff -u -r1.200 symbols.c
--- symbols.c   5 Feb 2010 16:19:18 -0000       1.200
+++ symbols.c   23 Feb 2010 13:40:52 -0000
@@ -8547,7 +8547,7 @@
         FREEBUF(req->buf);
 
         sprintf(buf, "set complaints 0");
-        gdb_pass_through(buf, NULL, 0);
+        gdb_pass_through(buf, NULL, GNU_RETURN_ON_ERROR);
 
         return(!(req->flags & GNU_COMMAND_FAILED));
 }
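
For readers without the crash sources at hand, the one-line change in
the patch can be pictured with a small stand-alone model. This is only
a sketch under a stated assumption: that GNU_RETURN_ON_ERROR makes a
failed pass-through command return to its caller instead of unwinding
to the top-level command loop. That reading is inferred from the flag's
name and is not spelled out in this message. Every model_* and MODEL_*
name below is hypothetical and does not exist in crash or gdb.

```c
#include <setjmp.h>
#include <string.h>

/* Hypothetical flags; they only mimic the shape of the real ones. */
#define MODEL_RETURN_ON_ERROR 0x1UL

/* Stands in for the top-level command loop that errors unwind to. */
static jmp_buf model_main_loop;

/* Pretend backend: any command beginning with "bad" fails. */
static int model_backend(const char *cmd)
{
    return strncmp(cmd, "bad", 3) != 0;
}

/*
 * Run a command.  On failure, either hand control back to the caller
 * (MODEL_RETURN_ON_ERROR set) or jump straight to the main loop, which
 * skips whatever the caller still intended to do afterwards.
 */
static int model_pass_through(const char *cmd, unsigned long flags)
{
    if (model_backend(cmd))
        return 1;                      /* command succeeded */
    if (flags & MODEL_RETURN_ON_ERROR)
        return 0;                      /* failure reported to the caller */
    longjmp(model_main_loop, 1);       /* failure unwinds past the caller */
}

/* Returns 1 if the caller got control back, 0 if it was unwound. */
static int model_run(const char *cmd, unsigned long flags)
{
    volatile int returned = 0;

    if (setjmp(model_main_loop) == 0) {
        model_pass_through(cmd, flags);
        returned = 1;                  /* reached only on a normal return */
    }
    return returned;
}
```

In this model, model_run("bad command", 0) is unwound before the caller
regains control, while model_run("bad command", MODEL_RETURN_ON_ERROR)
returns normally and lets the caller finish its own cleanup and report
the failure itself.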
--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility