On 2023/11/14 17:49, Tao Liu wrote:
> There is an issue that, for kernel modules loaded by mod -s/-S, "dis -rl" fails
> to display the module's code line number data after executing the "bt" command
> in crash.
>
> Without the patch:
> crash> mod -S
> crash> bt
> PID: 1500 TASK: ff2bd8b093524000 CPU: 16 COMMAND: "lpfc_worker_0"
>  #0 [ff2c9f725c39f9e0] machine_kexec at ffffffff8e0686d3
> ...snip...
>  #7 [ff2c9f725c39fc00] page_fault at ffffffff8ea0114e
>     [exception RIP: lpfc_nlp_get+210]
>     RIP: ffffffffc0f60f82 RSP: ff2c9f725c39fcb0 RFLAGS: 00010046
>     RAX: 0000000000000046 RBX: ff2bd8d8ac056000 RCX: 0000000000fffffc
>     RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000046
>     RBP: ff2bd8d8ac056090 R8: 0000000000000000 R9: 0000000000000000
>     R10: ff2bd90d1f8701c0 R11: 0000000000000001 R12: ff2bd93320482ae0
>     R13: ff2bd93051a80524 R14: ff2bd93051a80000 R15: ff2bd9332079fc00
>     ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
>  #8 [ff2c9f725c39fcc0] __lpfc_sli_release_iocbq_s4 at ffffffffc0f2f425 [lpfc]
> ...snip...
> crash> dis -rl ffffffffc0f60f82
> 0xffffffffc0f60eb0 <lpfc_nlp_get>: nopl 0x0(%rax,%rax,1) [FTRACE NOP]
> 0xffffffffc0f60eb5 <lpfc_nlp_get+5>: push %rbp
> 0xffffffffc0f60eb6 <lpfc_nlp_get+6>: push %rbx
> 0xffffffffc0f60eb7 <lpfc_nlp_get+7>: test %rdi,%rdi
>
> With the patch:
> crash> mod -S
> crash> bt
> PID: 1500 TASK: ff2bd8b093524000 CPU: 16 COMMAND: "lpfc_worker_0"
>  #0 [ff2c9f725c39f9e0] machine_kexec at ffffffff8e0686d3
> ...snip...
>  #7 [ff2c9f725c39fc00] page_fault at ffffffff8ea0114e
>     [exception RIP: lpfc_nlp_get+210]
>     RIP: ffffffffc0f60f82 RSP: ff2c9f725c39fcb0 RFLAGS: 00010046
>     RAX: 0000000000000046 RBX: ff2bd8d8ac056000 RCX: 0000000000fffffc
>     RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000046
>     RBP: ff2bd8d8ac056090 R8: 0000000000000000 R9: 0000000000000000
>     R10: ff2bd90d1f8701c0 R11: 0000000000000001 R12: ff2bd93320482ae0
>     R13: ff2bd93051a80524 R14: ff2bd93051a80000 R15: ff2bd9332079fc00
>     ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
>  #8 [ff2c9f725c39fcc0] __lpfc_sli_release_iocbq_s4 at ffffffffc0f2f425 [lpfc]
> ...snip...
> crash> dis -rl ffffffffc0f60f82
> /usr/src/debug/kernel-4.18.0-425.13.1.el8_7/linux-4.18.0-425.13.1.el8_7.x86_64/drivers/scsi/lpfc/lpfc_hbadisc.c: 6756
> 0xffffffffc0f60eb0 <lpfc_nlp_get>: nopl 0x0(%rax,%rax,1) [FTRACE NOP]
> /usr/src/debug/kernel-4.18.0-425.13.1.el8_7/linux-4.18.0-425.13.1.el8_7.x86_64/drivers/scsi/lpfc/lpfc_hbadisc.c: 6759
> 0xffffffffc0f60eb5 <lpfc_nlp_get+5>: push %rbp
>
> The root cause is that, after a kernel module has been loaded by the mod
> command, its symtable is not expanded on the gdb side. The crash command
> bt or dis will trigger such an expansion.
> However, the symtable expansion is different for the 2 commands:
>
> The stack trace of "dis -rl" for symtable expanding:
>
> #0 0x00000000008d8d9f in add_compunit_symtab_to_objfile (cu=cu@entry=0xe6a77a0) at symfile.c:2914
> #1 0x00000000006d3293 in buildsym_compunit::end_symtab_with_blockvector (this=<optimized out>, static_block=static_block@entry=0xfbe4b60, section=1, expandable=expandable@entry=0) at buildsym.c:1072
> #2 0x00000000006d336a in buildsym_compunit::end_symtab_from_static_block (this=<optimized out>, static_block=static_block@entry=0xfbe4b60, section=<optimized out>, expandable=expandable@entry=0) at buildsym.c:1106
> #3 0x000000000077e8e9 in process_full_comp_unit (pretend_language=<optimized out>, cu=0x8ee4c60) at /usr/include/c++/8/bits/unique_ptr.h:716
> #4 process_queue (per_objfile=0xc54c870) at dwarf2/read.c:9220
> #5 dw2_do_instantiate_symtab (per_cu=<optimized out>, per_objfile=0xc54c870, skip_partial=<optimized out>) at dwarf2/read.c:2448
> #6 0x000000000077ed67 in dw2_instantiate_symtab (per_cu=0xdd0a320, per_objfile=0xc54c870, skip_partial=<optimized out>) at dwarf2/read.c:2472
> #7 0x000000000077f75e in dw2_expand_all_symtabs (objfile=<optimized out>) at dwarf2/read.c:3768
> #8 0x00000000008f254d in gdb_get_line_number (req=0x7fffffffb1f0) at symtab.c:7112
> #9 0x00000000008f22af in gdb_command_funnel_1 (req=0x7fffffffb1f0) at symtab.c:7023
> #10 0x00000000008f2003 in gdb_command_funnel (req=0x7fffffffb1f0) at symtab.c:6965
> #11 0x00000000005b7f02 in gdb_interface (req=req@entry=0x7fffffffb1f0) at gdb_interface.c:409
> #12 0x00000000005f5bd8 in get_line_number (addr=18446744072651935408, buf=buf@entry=0x7fffffffd460 "", reserved=reserved@entry=0) at symbols.c:4440
> #13 0x000000000059e574 in cmd_dis () at kernel.c:2143
>
> The stack trace of "bt" for symtable expanding:
>
> #0 0x00000000008d8d9f in add_compunit_symtab_to_objfile (cu=cu@entry=0x1ad15630) at symfile.c:2914
> #1 0x00000000006d3293 in buildsym_compunit::end_symtab_with_blockvector (this=<optimized out>, static_block=static_block@entry=0x1db0be30, section=1, expandable=expandable@entry=0) at buildsym.c:1072
> #2 0x00000000006d336a in buildsym_compunit::end_symtab_from_static_block (this=<optimized out>, static_block=static_block@entry=0x1db0be30, section=<optimized out>, expandable=expandable@entry=0) at buildsym.c:1106
> #3 0x000000000077e8e9 in process_full_comp_unit (pretend_language=<optimized out>, cu=0x7465240) at /usr/include/c++/8/bits/unique_ptr.h:716
> #4 process_queue (per_objfile=0xc113810) at dwarf2/read.c:9220
> #5 dw2_do_instantiate_symtab (per_cu=<optimized out>, per_objfile=0xc113810, skip_partial=<optimized out>) at dwarf2/read.c:2448
> #6 0x000000000077ed67 in dw2_instantiate_symtab (per_cu=0xdd069d0, per_objfile=0xc113810, skip_partial=<optimized out>) at dwarf2/read.c:2472
> #7 0x000000000077f8ed in dw2_lookup_symbol (objfile=<optimized out>, block_index=STATIC_BLOCK, name=0x7fffffffc890 "cpumask_t", domain=STRUCT_DOMAIN) at dwarf2/read.c:3669
> #8 0x00000000008e6d03 in lookup_symbol_via_quick_fns (objfile=0xdd277a0, block_index=STATIC_BLOCK, name=0x7fffffffc890 "cpumask_t", domain=STRUCT_DOMAIN) at symtab.c:2392
> #9 0x00000000008e7153 in lookup_symbol_in_objfile (objfile=0xdd277a0, block_index=STATIC_BLOCK, name=0x7fffffffc890 "cpumask_t", domain=STRUCT_DOMAIN) at symtab.c:2541
> #10 0x00000000008e73c6 in lookup_symbol_global_or_static_iterator_cb (objfile=0xdd277a0, cb_data=0x7fffffffc470) at symtab.c:2615
> #11 0x00000000008b99c4 in svr4_iterate_over_objfiles_in_search_order (gdbarch=<optimized out>, cb=0x8e7342 <lookup_symbol_global_or_static_iterator_cb(objfile*, void*)>, cb_data=0x7fffffffc470, current_objfile=0x0) at solib-svr4.c:3248
> #12 0x00000000008e754e in lookup_global_or_static_symbol (name=0x7fffffffc890 "cpumask_t", block_index=STATIC_BLOCK, objfile=0x0, domain=STRUCT_DOMAIN) at symtab.c:2660
> #13 0x00000000008e75da in lookup_static_symbol (name=0x7fffffffc890 "cpumask_t", domain=STRUCT_DOMAIN) at symtab.c:2678
> #14 0x00000000008e632c in lookup_symbol_aux (name=0x7fffffffc890 "cpumask_t", match_type=symbol_name_match_type::FULL, block=0x0, domain=STRUCT_DOMAIN, language=language_c, is_a_field_of_this=0x0) at symtab.c:2122
> #15 0x00000000008e5a7a in lookup_symbol_in_language (name=0x7fffffffc890 "cpumask_t", block=0x0, domain=STRUCT_DOMAIN, lang=language_c, is_a_field_of_this=0x0) at symtab.c:1889
> #16 0x00000000008e5b30 in lookup_symbol (name=0x7fffffffc890 "cpumask_t", block=0x0, domain=STRUCT_DOMAIN, is_a_field_of_this=0x0) at symtab.c:1915
> #17 0x00000000008f2a4a in gdb_get_datatype (req=0x7fffffffc730) at symtab.c:7229
> #18 0x00000000008f22c0 in gdb_command_funnel_1 (req=0x7fffffffc730) at symtab.c:7027
> #19 0x00000000008f2003 in gdb_command_funnel (req=0x7fffffffc730) at symtab.c:6965
> #20 0x00000000005b7f02 in gdb_interface (req=req@entry=0x7fffffffc730) at gdb_interface.c:409
> #21 0x00000000005f8a9f in datatype_info (name=name@entry=0xa8454d "cpumask_t", member=member@entry=0x0, dm=dm@entry=0xfffffffffffffffc) at symbols.c:5715
> #22 0x0000000000599947 in cpu_map_size (type=<optimized out>) at kernel.c:913
> #23 0x00000000005a975d in get_cpus_online () at kernel.c:9556
> #24 0x0000000000637a8b in diskdump_get_prstatus_percpu (cpu=16) at diskdump.c:2277
> #25 0x000000000062f0e4 in get_netdump_regs_x86_64 (bt=0x7fffffffd950, ripp=0x7fffffffd130, rspp=0x7fffffffd138) at netdump.c:3471
> #26 0x000000000059fe68 in back_trace (bt=bt@entry=0x7fffffffd950) at kernel.c:3092
> #27 0x00000000005ab1cb in cmd_bt () at kernel.c:2859
>
> In the "dis -rl" stack trace, dw2_expand_all_symtabs() is called to expand
> all symtables of the objfile, which is "*.ko.debug" in our case. In the "bt"
> stack trace, however, not everything is expanded: only the subset of
> symtables needed to find a symbol via dw2_lookup_symbol().
> As a result, objfile->compunit_symtabs, which is the head of a singly linked
> list of struct compunit_symtab, is not NULL but does not contain all
> symtables. It will not be reinitialized in gdb_get_line_number() by "dis -rl"
> because the !objfile_has_full_symbols(objfile) check will fail, so "dis -rl"
> cannot display the proper code line number data.
>
> This patch forces all the symtables of a module to be expanded during the
> mod load phase, so no matter what commands follow, objfile->compunit_symtabs
> always contains all symtables.

Thank you for looking into this issue.

A question: is "mod -S -r" a workaround for it?

I'm thinking that, if the current gdb's auto expansion is not good for
crash, maybe we can make the behavior of the "mod -r" option the default.

The option adds "-readnow" to the add-symbol-file command, and it looks
the same as your patch to me:

$ vim gdb-10.2/gdb/symfile.c
  /* We now have at least a partial symbol table.  Check to see if the
     user requested that all symbols be read on initial access via either
     the gdb startup command line or on a per symbol file basis.  Expand
     all partial symbol tables for this objfile if so.  */

  if ((flags & OBJF_READNOW))
    {
      if (should_print)
        printf_filtered (_("Expanding full symbols from %ps...\n"),
                         styled_string (file_name_style.style (), name));

      if (objfile->sf)
        objfile->sf->qf->expand_all_symtabs (objfile);
    }

Thanks,
Kazu

>
> Signed-off-by: Tao Liu <ltao@xxxxxxxxxx>
> ---
>
> PS: This patch is standalone and is not a follow-up of
> [PATCH v2] symbols: skip load .init.* sections if module was successfully initialized
>
> ---
>  gdb-10.2.patch | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/gdb-10.2.patch b/gdb-10.2.patch
> index d81030d..0a9a4e1 100644
> --- a/gdb-10.2.patch
> +++ b/gdb-10.2.patch
> @@ -3187,3 +3187,14 @@ exit 0
>         result = stringtab + symbol_entry->_n._n_n._n_offset;
>       }
>     else
> +--- gdb-10.2/gdb/symtab.c.orig
> ++++ gdb-10.2/gdb/symtab.c
> +@@ -7537,6 +7537,8 @@ gdb_add_symbol_file(struct gnu_request *req)
> +                     lm->loaded_objfile = objfile->separate_debug_objfile;
> +                 else
> +                     lm->loaded_objfile = objfile;
> ++                if (lm->loaded_objfile->sf)
> ++                    lm->loaded_objfile->sf->qf->expand_all_symtabs(lm->loaded_objfile);
> +                 break;
> +             }
> +         }