----- "John Wright" <john.wright@xxxxxx> wrote:
> Hello,
>
> I have a dump that causes crash to segfault:
>
> $ crash kernel_link dump.201002031023
>
> crash 5.0.0
> Copyright (C) 2002-2010 Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006 IBM Corporation
> Copyright (C) 1999-2006 Hewlett-Packard Co
> Copyright (C) 2005, 2006 Fujitsu Limited
> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
> Copyright (C) 2005 NEC Corporation
> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions. Enter "help copying" to see the conditions.
> This program has absolutely no warranty. Enter "help warranty" for details.
>
> GNU gdb (GDB) 7.0
> Copyright (C) 2009 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu"...
>
> please wait... (gathering module symbol data)Segmentation fault
> (core dumped)
>
> Interestingly, it doesn't crash every time. But I tracked it down to a
> change in Linux 2.6.32 (see [1]) which changes how the module symbol
> table is stored. It only seems to make a difference when a module is
> still coming up.
>
> Here's what's happening, as I understand it:
>
> - crash loads the contents of a struct module from memory into a buffer
>   (pointed to by a char * called modbuf)
> - It gets some of the fields out of it and stores it in an internal
>   struct load_module lm.
>   Of importance here:
>     - mod_base : the kernel address of the core of the module,
>       corresponding to the module_core member of struct module
>     - mod_size : the core_size element of the struct module
> - It allocates a buffer of size mod_size, called module_buf, and reads
>   mod_size bytes from the kernel address mod_base (from the dumpfile)
>   into module_buf
> - Then, it gets the kernel address for the symbol table from modbuf,
>   given by the symtab member of struct module, storing it in a variable
>   called ksymtab
> - Then, it tries to get the local address of the symbol table via
>
>     locsymtab = module_buf + (ksymtab - lm->mod_base);
>
>   In other words, it's getting the offset between mod->symtab and
>   mod->module_core, so that it can treat module_buf as if it were
>   module_core.
>
> But that calculation only works if the symbol table is actually in the
> module core. As it turns out, in Linux 2.6.32, the symbol table
> handling works a little differently: mod->symtab points to somewhere
> inside mod->module_init until after its init function finishes running.
> After that, it gets set to the value of mod->symtab_core.
>
> Anyway, since the module's init section wasn't loaded into module_buf,
> this won't work for modules that are in the process of loading. It
> would be fairly simple to stop it from crashing by making sure locsymtab
> doesn't point outside the area allocated for module_buf (patch
> attached). Actually making sure it loads the symbol table would be a
> little more tricky, involving loading the init section as well, but
> maybe that's not necessary (hopefully someone debugging a module that
> crashes in its init function would be able to load debugging symbols for
> the module after crash loads).
>
> I've attached a simple module that panics the kernel in its init
> function, and a patch to crash that just ignores the ksymtab and kstrtab
> information if they point outside of the module core.
>
> [1]:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=4a4962263f07d14660849ec134ee42b63e95ea9a
>
> --
> +-----------------------------------------+
> | John Wright <john.wright@xxxxxx>        |
> | Hewlett-Packard Telco Platform Software |
> +-----------------------------------------+

Nice -- makes perfect sense. Somewhat similar to a 4.0-7.1 change
where a module was kmalloc'ing its own exported symbol list outside
of its own virtual address space, and then overwriting its own symbol
list pointer.

And again, thanks for doing the heavy lifting.

Queued for the next release.

Dave

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
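
For readers without the attachment, the bounds check described in the
quoted message might look roughly like the sketch below. This is not the
attached patch and not code from crash itself: the helper names
in_module_core() and core_to_local() are hypothetical, and the len
argument (the size of the table being looked up) is an assumption added
for illustration. Only module_buf, mod_base, mod_size and the offset
arithmetic come from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of crash): returns nonzero if the range
 * [kaddr, kaddr + len) lies entirely inside the module core
 * [mod_base, mod_base + mod_size). Comparisons are arranged so that
 * no unsigned addition can wrap around.
 */
int in_module_core(uint64_t kaddr, uint64_t len,
                   uint64_t mod_base, uint64_t mod_size)
{
    uint64_t off;

    if (kaddr < mod_base)
        return 0;               /* starts below the core */
    off = kaddr - mod_base;
    if (off > mod_size)
        return 0;               /* starts past the end of the core */
    return len <= mod_size - off;
}

/*
 * Hypothetical helper: translate a kernel address into a pointer inside
 * module_buf, the local copy of the module core. Returns NULL when the
 * address is outside the core (for example, when mod->symtab still
 * points into the init section), so the caller can skip the table
 * instead of dereferencing a pointer outside module_buf.
 */
char *core_to_local(char *module_buf, uint64_t kaddr, uint64_t len,
                    uint64_t mod_base, uint64_t mod_size)
{
    assert(module_buf != NULL);
    if (!in_module_core(kaddr, len, mod_base, mod_size))
        return NULL;
    return module_buf + (kaddr - mod_base);
}
```

With helpers like these, the locsymtab computation would become a call
such as core_to_local(module_buf, ksymtab, len, lm->mod_base,
lm->mod_size), and a NULL result would mean "ignore the ksymtab and
kstrtab information for this module", as the message proposes.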