----- "ville mattila" <ville.mattila@xxxxxxxxxxxxx> wrote:

> crash-utility-bounces@xxxxxxxxxx wrote on 14.01.2010 16:08:41:
>
> > From: Dave Anderson <anderson@xxxxxxxxxx>
> >
> > ----- "ville mattila" <ville.mattila@xxxxxxxxxxxxx> wrote:
> >
> > > Hello,
> > >
> > >  I get a segmentation fault from crash on our 64-bit kernel.
> > >  The kernel crash was triggered by "echo c > /proc/sysrq-trigger".
> > >  The reason seems to be that x86_64_cpu_pda_init is
> > >  not called; at least gdb does not break there.
> > >
> > >  Here is a little patch that fixes it. Everything seems to
> > >  work correctly. I'll provide more info if needed.
> > >
> > >
> > > --- crash-5.0.0/x86_64.c        2010-01-06 21:38:27.000000000 +0200
> > > +++ crash-5.0.0-64bit/x86_64.c  2010-01-14 08:24:13.679603706 +0200
> > > @@ -6325,6 +6325,12 @@ x86_64_get_active_set(void)
> > >
> > >          ms = machdep->machspec;
> > >
> > > +        if (!ms->current) {
> > > +                error(INFO, "%s: Cannot get active set, ms->current is NULL\n",
> > > +                        __func__);
> > > +                return;
> > > +        }
> > > +
> >
> > That patch just masks the real problem.
> >
> > What kernel version is it?
> >
> > If it's 2.6.30 or later, then x86_64_per_cpu_init() should
> > be called, otherwise x86_64_cpu_pda_init() is called. And
> > whichever one gets called should allocate the array.
> >
> > 2.6.30 or later kernels should show:
> >
> >   crash> struct x8664_pda
> >   struct: invalid data structure reference: x8664_pda
> >   crash>
> >
> > and they will use x86_64_per_cpu_init().
> >
> > Kernels prior to 2.6.30 should show:
> >
> >   crash> struct x8664_pda
> >   struct x8664_pda {
> >       struct task_struct *pcurrent;
> >       long unsigned int data_offset;
> >       long unsigned int kernelstack;
> >       long unsigned int oldrsp;
> >       long unsigned int debugstack;
> >       int irqcount;
> >       int cpunumber;
> >       char *irqstackptr;
> >       int nodenumber;
> >       unsigned int __softirq_pending;
> >       unsigned int __nmi_count;
> >       int mmu_state;
> >       struct mm_struct *active_mm;
> >       unsigned int apic_timer_irqs;
> >   }
> >   SIZE: 128
> >   crash>
> >
> > and they will use x86_64_cpu_pda_init().
> >
> > If you're having trouble with gdb, can you put some fprintf(fp, ...)
> > calls in the relevant function and find out why it isn't doing
> > the calloc() call?
> >
>
>  Yes, I thought so. This is a customized 2.6.31.7 kernel.org
>  kernel. This is a UP configuration, i.e. CONFIG_SMP is n.
>  I think the problem is that the PER_CPU_OFF is not set.

Ahah -- that would do it. UP x86_64 kernels are so rare that
apparently nobody ever noticed, and I don't have a UP x86_64
vmcore to even test with. (RHEL5 doesn't even ship a UP x86_64
kernel.)

Anyway, that change went into 4.0-8.11. And as far as I can tell,
x86_64_per_cpu_init() should still populate the single "ms->current[0]"
task from the "per_cpu__current_task" symbol on UP kernels -- which
doesn't need the PER_CPU_OFF translation mechanism.

In other words, I think you should be able to do this on your
UP kernel:

  crash> px per_cpu__current_task

and it should show the panic task address that comes up as the
current task upon invocation. Is that right?

>  Btw, the "struct" command caused another segmentation fault.
>  Here is gdb bt:
>
> (gdb) bt
> #0  0x00007f74b3524a92 in strcmp () from /lib/libc.so.6
> #1  0x0000000000534284 in lookup_partial_symtab (name=0x120e3c0 "x8664_pda")
>     at symtab.c:276
> #2  0x00000000005344ed in lookup_symtab (name=0x120e3c0 "x8664_pda")
>     at symtab.c:228
> #3  0x000000000060019d in c_lex () at c-exp.y:2149
> #4  0x00000000006008f5 in c_parse_internal () at c-exp.c.tmp:1468
> #5  0x00000000006022dd in c_parse () at c-exp.y:2225
> #6  0x000000000055f614 in parse_exp_in_context (stringptr=0x7fffbc2f2260,
>     block=<value optimized out>, comma=<value optimized out>,
>     void_context_p=0, out_subexp=0x0) at parse.c:1094
> #7  0x000000000055f924 in parse_expression (string=0x7fffbc2f2950 "x8664_pda")
>     at parse.c:1144
> #8  0x000000000053291b in gdb_command_funnel (req=0xca2c00) at symtab.c:4992
> #9  0x00000000004c1740 in gdb_interface (req=0xca2c00) at gdb_interface.c:407
> #10 0x00000000004e9dca in datatype_info (name=0xb618a7 "x8664_pda",
>     member=0x0, dm=0x7fffbc2f3620) at symbols.c:4146
> #11 0x00000000004eb1ee in arg_to_datatype (s=0xb618a7 "x8664_pda",
>     dm=0x7fffbc2f3620, flags=524290) at symbols.c:4867
> #12 0x00000000004efa1b in cmd_datatype_common (flags=2048) at symbols.c:4664
> #13 0x000000000045efd9 in exec_command () at main.c:644
> #14 0x000000000045f1fa in main_loop () at main.c:603
> #15 0x00000000005452a9 in captured_command_loop (data=0x120e3c0)
>     at ./main.c:226
> #16 0x00000000005434e4 in catch_errors (func=0x5452a0 <captured_command_loop>,
>     func_args=0x0, errstring=0x7f9d7c "", mask=<value optimized out>)
>     at exceptions.c:520
> #17 0x0000000000544d36 in captured_main (data=<value optimized out>)
>     at ./main.c:924
> #18 0x00000000005434e4 in catch_errors (func=0x544340 <captured_main>,
>     func_args=0x7fffbc2f38b0, errstring=0x7f9d7c "",
>     mask=<value optimized out>) at exceptions.c:520
> #19 0x000000000054412f in gdb_main_entry (argc=<value optimized out>,
>     argv=<value optimized out>) at ./main.c:939
> #20 0x000000000045fece in main (argc=3, argv=0x7fffbc2f3a08) at main.c:517
> (gdb) frame 1
> #1  0x0000000000534284 in lookup_partial_symtab (name=0x120e3c0 "x8664_pda")
>     at symtab.c:276
> 276         if (FILENAME_CMP (name, pst->filename) == 0)
> (gdb) p name
> $4 = 0x120e3c0 "x8664_pda"
> (gdb) p pst
> $5 = (struct partial_symtab *) 0x14d6040
> (gdb) p pst->filename
> $6 = 0x0
> (gdb) p *pst
> $7 = {next = 0x0, filename = 0x0, fullname = 0x0, dirname = 0x0,
>   objfile = 0x0, section_offsets = 0x0, textlow = 0, texthigh = 0,
>   dependencies = 0x0, number_of_dependencies = 0, globals_offset = 0,
>   n_global_syms = 0, statics_offset = 0, n_static_syms = 0, symtab = 0x0,
>   read_symtab = 0, read_symtab_private = 0x0, readin = 0 '\0'}
> (gdb)
>
>
>  I fixed it with the patch below:
>
> --- crash-5.0.0/gdb-7.0/gdb/symtab.c        2010-01-15 10:41:00.919973440 +0200
> +++ crash-5.0.0-64bit/gdb-7.0/gdb/symtab.c  2010-01-15 10:19:21.436128740 +0200
> @@ -256,7 +256,7 @@ got_symtab:
>  struct partial_symtab *
>  lookup_partial_symtab (const char *name)
>  {
> -  struct partial_symtab *pst;
> +  struct partial_symtab *pst = NULL;
>    struct objfile *objfile;
>    char *full_path = NULL;
>    char *real_path = NULL;
> @@ -273,7 +273,7 @@ lookup_partial_symtab (const char *name)
>
>    ALL_PSYMTABS (objfile, pst)
>    {
> -    if (FILENAME_CMP (name, pst->filename) == 0)
> +    if (pst->filename && FILENAME_CMP (name, pst->filename) == 0)
>        {
>          return (pst);
>        }
> @@ -311,7 +311,7 @@ lookup_partial_symtab (const char *name)
>    if (lbasename (name) == name)
>      ALL_PSYMTABS (objfile, pst)
>      {
> -      if (FILENAME_CMP (lbasename (pst->filename), name) == 0)
> +      if (pst->filename && FILENAME_CMP (lbasename (pst->filename), name) == 0)
>          return (pst);
>      }

Weird -- so you're apparently able to do that when running any
"struct <non-existent>" command from the crash command line?
But I can't reproduce that -- this is what should happen:

  crash> struct this_is_junk
  struct: invalid data structure reference: this_is_junk
  crash>

and I don't understand what could be different with your
custom kernel.

> >
> > Either that, or if you can make the vmlinux/vmcore pair available
> > for me to download, I can look at it.
>
>  I'll arrange this if the above information is not enough.

Yes please -- can you put the vmlinux/vmcore pair somewhere where I can
download it? You can send me the particulars off-line to anderson@xxxxxxxxxxx

Thanks,
  Dave

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
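
For reference, the guard used in the symtab.c patch quoted above can be
shown in isolation. This is only a minimal sketch and not gdb code:
"struct psymtab_stub" and find_by_filename() are hypothetical names made
up for the illustration, and plain strcmp() stands in for gdb's
FILENAME_CMP. It shows how a list walk can skip an entry whose filename
is NULL instead of passing it to the comparison; it does not explain
why such a zeroed entry ends up in the list in the first place.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for gdb's struct partial_symtab; only the
 * two fields needed for the illustration are kept. */
struct psymtab_stub {
    struct psymtab_stub *next;
    const char *filename;    /* may be NULL, like the zeroed entry in the gdb output */
};

/* Walk the list and return the first entry whose filename equals
 * name.  Entries with a NULL filename are skipped rather than
 * handed to strcmp(), which would dereference the NULL pointer. */
const struct psymtab_stub *
find_by_filename(const struct psymtab_stub *head, const char *name)
{
    const struct psymtab_stub *p;

    for (p = head; p != NULL; p = p->next) {
        if (p->filename && strcmp(name, p->filename) == 0)
            return p;
    }
    return NULL;    /* no match found */
}
```

With a list made of one all-zero entry followed by one valid entry, a
lookup for "x8664_pda" would simply return NULL instead of crashing in
strcmp(), which matches the "invalid data structure reference"
behaviour expected from the crash command line.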