Hi Kazu,

On Wed, Nov 15, 2023 at 4:21 PM HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab@xxxxxxx> wrote:
>
> On 2023/11/14 17:49, Tao Liu wrote:
> > There is an issue that, for kernel modules loaded by mod -s/-S, "dis -rl" fails
> > to display the module's code line number data after executing the "bt" cmd in crash.
> >
> > Without the patch:
> > crash> mod -S
> > crash> bt
> > PID: 1500 TASK: ff2bd8b093524000 CPU: 16 COMMAND: "lpfc_worker_0"
> > #0 [ff2c9f725c39f9e0] machine_kexec at ffffffff8e0686d3
> > ...snip...
> > #7 [ff2c9f725c39fc00] page_fault at ffffffff8ea0114e
> >     [exception RIP: lpfc_nlp_get+210]
> >     RIP: ffffffffc0f60f82 RSP: ff2c9f725c39fcb0 RFLAGS: 00010046
> >     RAX: 0000000000000046 RBX: ff2bd8d8ac056000 RCX: 0000000000fffffc
> >     RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000046
> >     RBP: ff2bd8d8ac056090 R8: 0000000000000000 R9: 0000000000000000
> >     R10: ff2bd90d1f8701c0 R11: 0000000000000001 R12: ff2bd93320482ae0
> >     R13: ff2bd93051a80524 R14: ff2bd93051a80000 R15: ff2bd9332079fc00
> >     ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> > #8 [ff2c9f725c39fcc0] __lpfc_sli_release_iocbq_s4 at ffffffffc0f2f425 [lpfc]
> > ...snip...
> > crash> dis -rl ffffffffc0f60f82
> > 0xffffffffc0f60eb0 <lpfc_nlp_get>: nopl 0x0(%rax,%rax,1) [FTRACE NOP]
> > 0xffffffffc0f60eb5 <lpfc_nlp_get+5>: push %rbp
> > 0xffffffffc0f60eb6 <lpfc_nlp_get+6>: push %rbx
> > 0xffffffffc0f60eb7 <lpfc_nlp_get+7>: test %rdi,%rdi
> >
> > With the patch:
> > crash> mod -S
> > crash> bt
> > PID: 1500 TASK: ff2bd8b093524000 CPU: 16 COMMAND: "lpfc_worker_0"
> > #0 [ff2c9f725c39f9e0] machine_kexec at ffffffff8e0686d3
> > ...snip...
> > #7 [ff2c9f725c39fc00] page_fault at ffffffff8ea0114e
> >     [exception RIP: lpfc_nlp_get+210]
> >     RIP: ffffffffc0f60f82 RSP: ff2c9f725c39fcb0 RFLAGS: 00010046
> >     RAX: 0000000000000046 RBX: ff2bd8d8ac056000 RCX: 0000000000fffffc
> >     RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000046
> >     RBP: ff2bd8d8ac056090 R8: 0000000000000000 R9: 0000000000000000
> >     R10: ff2bd90d1f8701c0 R11: 0000000000000001 R12: ff2bd93320482ae0
> >     R13: ff2bd93051a80524 R14: ff2bd93051a80000 R15: ff2bd9332079fc00
> >     ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> > #8 [ff2c9f725c39fcc0] __lpfc_sli_release_iocbq_s4 at ffffffffc0f2f425 [lpfc]
> > ...snip...
> > crash> dis -rl ffffffffc0f60f82
> > /usr/src/debug/kernel-4.18.0-425.13.1.el8_7/linux-4.18.0-425.13.1.el8_7.x86_64/drivers/scsi/lpfc/lpfc_hbadisc.c: 6756
> > 0xffffffffc0f60eb0 <lpfc_nlp_get>: nopl 0x0(%rax,%rax,1) [FTRACE NOP]
> > /usr/src/debug/kernel-4.18.0-425.13.1.el8_7/linux-4.18.0-425.13.1.el8_7.x86_64/drivers/scsi/lpfc/lpfc_hbadisc.c: 6759
> > 0xffffffffc0f60eb5 <lpfc_nlp_get+5>: push %rbp
> >
> > The root cause is that, after a kernel module has been loaded by the mod command,
> > the symtable is not expanded on the gdb side. The crash commands bt or dis will
> > trigger such an expansion.
> > However, the symtable expansion is different for the 2 commands:
> >
> > The stack trace of "dis -rl" for symtable expanding:
> >
> > #0 0x00000000008d8d9f in add_compunit_symtab_to_objfile (cu=cu@entry=0xe6a77a0) at symfile.c:2914
> > #1 0x00000000006d3293 in buildsym_compunit::end_symtab_with_blockvector (this=<optimized out>, static_block=static_block@entry=0xfbe4b60, section=1, expandable=expandable@entry=0) at buildsym.c:1072
> > #2 0x00000000006d336a in buildsym_compunit::end_symtab_from_static_block (this=<optimized out>, static_block=static_block@entry=0xfbe4b60, section=<optimized out>, expandable=expandable@entry=0) at buildsym.c:1106
> > #3 0x000000000077e8e9 in process_full_comp_unit (pretend_language=<optimized out>, cu=0x8ee4c60) at /usr/include/c++/8/bits/unique_ptr.h:716
> > #4 process_queue (per_objfile=0xc54c870) at dwarf2/read.c:9220
> > #5 dw2_do_instantiate_symtab (per_cu=<optimized out>, per_objfile=0xc54c870, skip_partial=<optimized out>) at dwarf2/read.c:2448
> > #6 0x000000000077ed67 in dw2_instantiate_symtab (per_cu=0xdd0a320, per_objfile=0xc54c870, skip_partial=<optimized out>) at dwarf2/read.c:2472
> > #7 0x000000000077f75e in dw2_expand_all_symtabs (objfile=<optimized out>) at dwarf2/read.c:3768
> > #8 0x00000000008f254d in gdb_get_line_number (req=0x7fffffffb1f0) at symtab.c:7112
> > #9 0x00000000008f22af in gdb_command_funnel_1 (req=0x7fffffffb1f0) at symtab.c:7023
> > #10 0x00000000008f2003 in gdb_command_funnel (req=0x7fffffffb1f0) at symtab.c:6965
> > #11 0x00000000005b7f02 in gdb_interface (req=req@entry=0x7fffffffb1f0) at gdb_interface.c:409
> > #12 0x00000000005f5bd8 in get_line_number (addr=18446744072651935408, buf=buf@entry=0x7fffffffd460 "", reserved=reserved@entry=0) at symbols.c:4440
> > #13 0x000000000059e574 in cmd_dis () at kernel.c:2143
> >
> > The stack trace of "bt" for symtable expanding:
> >
> > #0 0x00000000008d8d9f in add_compunit_symtab_to_objfile (cu=cu@entry=0x1ad15630) at symfile.c:2914
> > #1 0x00000000006d3293 in buildsym_compunit::end_symtab_with_blockvector (this=<optimized out>, static_block=static_block@entry=0x1db0be30, section=1, expandable=expandable@entry=0) at buildsym.c:1072
> > #2 0x00000000006d336a in buildsym_compunit::end_symtab_from_static_block (this=<optimized out>, static_block=static_block@entry=0x1db0be30, section=<optimized out>, expandable=expandable@entry=0) at buildsym.c:1106
> > #3 0x000000000077e8e9 in process_full_comp_unit (pretend_language=<optimized out>, cu=0x7465240) at /usr/include/c++/8/bits/unique_ptr.h:716
> > #4 process_queue (per_objfile=0xc113810) at dwarf2/read.c:9220
> > #5 dw2_do_instantiate_symtab (per_cu=<optimized out>, per_objfile=0xc113810, skip_partial=<optimized out>) at dwarf2/read.c:2448
> > #6 0x000000000077ed67 in dw2_instantiate_symtab (per_cu=0xdd069d0, per_objfile=0xc113810, skip_partial=<optimized out>) at dwarf2/read.c:2472
> > #7 0x000000000077f8ed in dw2_lookup_symbol (objfile=<optimized out>, block_index=STATIC_BLOCK, name=0x7fffffffc890 "cpumask_t", domain=STRUCT_DOMAIN) at dwarf2/read.c:3669
> > #8 0x00000000008e6d03 in lookup_symbol_via_quick_fns (objfile=0xdd277a0, block_index=STATIC_BLOCK, name=0x7fffffffc890 "cpumask_t", domain=STRUCT_DOMAIN) at symtab.c:2392
> > #9 0x00000000008e7153 in lookup_symbol_in_objfile (objfile=0xdd277a0, block_index=STATIC_BLOCK, name=0x7fffffffc890 "cpumask_t", domain=STRUCT_DOMAIN) at symtab.c:2541
> > #10 0x00000000008e73c6 in lookup_symbol_global_or_static_iterator_cb (objfile=0xdd277a0, cb_data=0x7fffffffc470) at symtab.c:2615
> > #11 0x00000000008b99c4 in svr4_iterate_over_objfiles_in_search_order (gdbarch=<optimized out>, cb=0x8e7342 <lookup_symbol_global_or_static_iterator_cb(objfile*, void*)>, cb_data=0x7fffffffc470, current_objfile=0x0) at solib-svr4.c:3248
> > #12 0x00000000008e754e in lookup_global_or_static_symbol (name=0x7fffffffc890 "cpumask_t", block_index=STATIC_BLOCK, objfile=0x0, domain=STRUCT_DOMAIN) at symtab.c:2660
> > #13 0x00000000008e75da in lookup_static_symbol (name=0x7fffffffc890 "cpumask_t", domain=STRUCT_DOMAIN) at symtab.c:2678
> > #14 0x00000000008e632c in lookup_symbol_aux (name=0x7fffffffc890 "cpumask_t", match_type=symbol_name_match_type::FULL, block=0x0, domain=STRUCT_DOMAIN, language=language_c, is_a_field_of_this=0x0) at symtab.c:2122
> > #15 0x00000000008e5a7a in lookup_symbol_in_language (name=0x7fffffffc890 "cpumask_t", block=0x0, domain=STRUCT_DOMAIN, lang=language_c, is_a_field_of_this=0x0) at symtab.c:1889
> > #16 0x00000000008e5b30 in lookup_symbol (name=0x7fffffffc890 "cpumask_t", block=0x0, domain=STRUCT_DOMAIN, is_a_field_of_this=0x0) at symtab.c:1915
> > #17 0x00000000008f2a4a in gdb_get_datatype (req=0x7fffffffc730) at symtab.c:7229
> > #18 0x00000000008f22c0 in gdb_command_funnel_1 (req=0x7fffffffc730) at symtab.c:7027
> > #19 0x00000000008f2003 in gdb_command_funnel (req=0x7fffffffc730) at symtab.c:6965
> > #20 0x00000000005b7f02 in gdb_interface (req=req@entry=0x7fffffffc730) at gdb_interface.c:409
> > #21 0x00000000005f8a9f in datatype_info (name=name@entry=0xa8454d "cpumask_t", member=member@entry=0x0, dm=dm@entry=0xfffffffffffffffc) at symbols.c:5715
> > #22 0x0000000000599947 in cpu_map_size (type=<optimized out>) at kernel.c:913
> > #23 0x00000000005a975d in get_cpus_online () at kernel.c:9556
> > #24 0x0000000000637a8b in diskdump_get_prstatus_percpu (cpu=16) at diskdump.c:2277
> > #25 0x000000000062f0e4 in get_netdump_regs_x86_64 (bt=0x7fffffffd950, ripp=0x7fffffffd130, rspp=0x7fffffffd138) at netdump.c:3471
> > #26 0x000000000059fe68 in back_trace (bt=bt@entry=0x7fffffffd950) at kernel.c:3092
> > #27 0x00000000005ab1cb in cmd_bt () at kernel.c:2859
> >
> > In the stack trace of "dis -rl", it calls dw2_expand_all_symtabs() to expand
> > all symtables of the objfile, or "*.ko.debug" in our case. However, in
> > the stack trace of "bt", it doesn't expand all of them, but only a subset of
> > symtables which is enough to find a symbol by dw2_lookup_symbol().
> > As a result, the objfile->compunit_symtabs, which is the head of a singly
> > linked list of struct compunit_symtab, is not NULL but doesn't contain all
> > symtables. It will not be reinitialized in gdb_get_line_number() by "dis -rl"
> > because the !objfile_has_full_symbols(objfile) check will fail, so it cannot
> > display the proper code line number data.
> >
> > This patch will force all the symtables of the module to be expanded during
> > the mod load phase, so no matter what commands follow, objfile->compunit_symtabs
> > always contains all symtables.
>
> Thank you for looking into this issue.
>
> A question: is "mod -S -r" a workaround for it?
>

Yes, it works as expected with "mod -S -r". I didn't know the "-r" parameter
can trigger such a symbol expansion.

> I'm thinking that, if the current gdb's auto expansion is not good for
> crash, maybe we can make the behavior of the "mod -r" option the default. The
> option adds "-readnow" to the add-symbol-file command and it looks the same
> as your patch to me:
>
> $ vim gdb-10.2/gdb/symfile.c
>
>   /* We now have at least a partial symbol table. Check to see if the
>      user requested that all symbols be read on initial access via either
>      the gdb startup command line or on a per symbol file basis. Expand
>      all partial symbol tables for this objfile if so. */
>
>   if ((flags & OBJF_READNOW))
>     {
>       if (should_print)
>         printf_filtered (_("Expanding full symbols from %ps...\n"),
>                          styled_string (file_name_style.style (), name));
>
>       if (objfile->sf)
>         objfile->sf->qf->expand_all_symtabs (objfile);
>     }
>

Agreed, they do the same work. Thanks again for your suggestions. Do you want
me to draft the "make the mod -r option the default" patch now or later?

Thanks,
Tao Liu

>
> Thanks,
> Kazu
>
> >
> > Signed-off-by: Tao Liu <ltao@xxxxxxxxxx>
> > ---
> >
> > PS: This patch is standalone and is not a follow-up of
> > [PATCH v2] symbols: skip load .init.* sections if module was successfully initialized
> >
> > ---
> > gdb-10.2.patch | 11 +++++++++++
> > 1 file changed, 11 insertions(+)
> >
> > diff --git a/gdb-10.2.patch b/gdb-10.2.patch
> > index d81030d..0a9a4e1 100644
> > --- a/gdb-10.2.patch
> > +++ b/gdb-10.2.patch
> > @@ -3187,3 +3187,14 @@ exit 0
> > result = stringtab + symbol_entry->_n._n_n._n_offset;
> > }
> > else
> > +--- gdb-10.2/gdb/symtab.c.orig
> > ++++ gdb-10.2/gdb/symtab.c
> > +@@ -7537,6 +7537,8 @@ gdb_add_symbol_file(struct gnu_request *req)
> > + lm->loaded_objfile = objfile->separate_debug_objfile;
> > + else
> > + lm->loaded_objfile = objfile;
> > ++ if (lm->loaded_objfile->sf)
> > ++ lm->loaded_objfile->sf->qf->expand_all_symtabs(lm->loaded_objfile);
> > + break;
> > + }
> > + }