Re: Questions on multi-thread for crash

Hi Tao,

On 2023/02/10 18:04, Tao Liu wrote:
> On Fri, Feb 10, 2023 at 12:42:36AM +0000, HAGIO KAZUHITO(萩尾 一仁) wrote:
>> Hi Tao,
>>
>> On 2023/02/08 13:34, Tao Liu wrote:
>>> Hello,
>>>
>>> Recently I made an attempt to introduce a thread pool for crash utility, to
>>> optimize the performance of crash.
>>
>> Thank you for the attempt, interesting.
>> What data led you to try to speed up the collection of member offsets?
>> First, I'm interested in which routines weigh most in crash startup time.
>>
>> To be honest, personally I'm fairly satisfied with the current
>> crash-8 startup time :) with commit cd8954023b (thanks to Hatayama-san)
>> and maybe the benefit of GDB's parallel loading, which Andrew mentioned.
>>
>> [root@rhel91u ~]# time echo quit | crash > /dev/null
>>
>> real    0m2.621s
>> user    0m2.574s
>> sys     0m0.112s
>>
>> So I would like to know first whether it's likely to be worth looking
>> into the slow part, especially if it requires the complexity of
>> multi-threading or some other approach.
>>
>> Thanks,
>> Kazu
>>
> 
> Hi Kazu,
> 
> The startup time of crash and drgn comparison:
> 
>    $ time echo "quit()" | drgn -c /tmp/maple/vmcore -s /tmp/maple/vmlinux > /dev/null
>    drgn 0.0.22+2.g33c3e36 (using Python 3.10.0, elfutils 0.185, with libkdumpfile)
>    For help, type help(drgn).
>    >>> import drgn
>    >>> from drgn import NULL, Object, cast, container_of, execscript, offsetof, reinterpret, sizeof
>    >>> from drgn.helpers.common import *
>    >>> from drgn.helpers.linux import *
>    warning: could not get debugging information for:
>    kernel modules (could not read depmod: open: /lib/modules/5.19.0-uek+/modules.dep.bin:
>    No such file or directory)
> 
>    real	0m0.222s
>    user	0m1.032s
>    sys	0m0.041s
> 
>    $ time echo quit | crash /tmp/maple/vmcore /tmp/maple/vmlinux > /dev/null
> 
>    real	0m8.266s
>    user	0m7.853s
>    sys	0m0.248s
> 
> As you can see, there is a large startup time difference between crash and drgn. I haven't
> looked into drgn's source code to check whether drgn does the same amount of startup work
> as crash. Then I measured the time consumed by the xx_init() functions of
> crash, and profiled the startup with gprof:
> 
>    $ crash /tmp/maple/vmcore /tmp/maple/vmlinux
>    gdb_session_init 		23.761000
>    machdep_init POST_RELOC 	0.060000
>    show_untrusted_files 		0.000000
>    kdump_backup_region_init 	0.001000
>    read_in_kernel_config 	1.287000
>    kernel_init	 		1724.044000 <<--- a lot of symbols resolving
>    machdep_init POST_GDB 	851.425000
>    vm_init 			1993.168000 <<--- a lot of symbols resolving
>    machdep_init POST_VM 		414.680000
>    module_init 			287.185000
>    help_init 			0.005000
>    task_init 			459.189000
>    vfs_init 			301.908000
>    net_init 			116.945000
>    dev_init 			6.494000
>    machdep_init POST_INIT 	9.176000
> 
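(Just for reference, per-init timings like the ones above could be collected with a small wrapper around each init call, for example the hypothetical sketch below; the mail does not show how the numbers were actually gathered.)

    /* Hypothetical instrumentation, not part of crash: time one init
     * step in milliseconds with clock_gettime(). */
    #include <stdio.h>
    #include <time.h>

    static double elapsed_ms(const struct timespec *t0, const struct timespec *t1)
    {
            return (t1->tv_sec - t0->tv_sec) * 1000.0 +
                   (t1->tv_nsec - t0->tv_nsec) / 1000000.0;
    }

    #define TIME_INIT(label, call)                                          \
            do {                                                            \
                    struct timespec t0, t1;                                 \
                    clock_gettime(CLOCK_MONOTONIC, &t0);                    \
                    call;                                                   \
                    clock_gettime(CLOCK_MONOTONIC, &t1);                    \
                    fprintf(stderr, "%-28s %f\n", (label),                  \
                            elapsed_ms(&t0, &t1));                          \
            } while (0)

    /* usage, e.g.:  TIME_INIT("vm_init", vm_init()); */

(The gprof profile of the same startup run follows.)
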
> Each sample counts as 0.01 seconds.
>    %   cumulative   self              self     total
>   time   seconds   seconds    calls  ms/call  ms/call  name
>   14.47      1.12     1.12                             skip_ws(char const*&, char const*&, char const*)
>   14.08      2.21     1.09                             lookup_partial_symbol(objfile*, partial_symtab*, lookup_name_info const&, int, domain_enum_tag)
>   13.44      3.25     1.04                             strncmp_iw_with_mode(char const*, char const*, unsigned long, strncmp_iw_mode, language, completion_match_for_lcd*)
>    5.81      3.70     0.45                             symbol_matches_search_name(general_symbol_info const*, lookup_name_info const&)
>    4.52      4.05     0.35                             gdb::bcache::insert(void const*, int, bool*)
>    4.26      4.38     0.33                             default_symbol_name_matcher(char const*, lookup_name_info const&, completion_match_result*)
>    4.26      4.71     0.33                             language_defn::get_symbol_name_matcher(lookup_name_info const&) const
>    3.23      4.96     0.25                             symbol_matches_domain(language, domain_enum_tag, domain_enum_tag)
>    2.84      5.18     0.22        1   220.00   220.00  symval_hash_init
>    2.20      5.35     0.17                             read_attribute_value(die_reader_specs const*, attribute*, unsigned int, long, unsigned char const*, bool*)
>    2.20      5.52     0.17                             iterative_hash
>    1.81      5.66     0.14                             strcmp_iw_ordered(char const*, char const*)
>    1.81      5.80     0.14                             psym_lookup_symbol(objfile*, block_enum, char const*, domain_enum_tag)
>    1.68      5.93     0.13                             dwarf2_psymtab::readin_p(objfile*) const
>    1.55      6.05     0.12   133543     0.00     0.00  symname_hash_install
> ...
> 
> It looks to me that crash spends a lot of time on symbol resolving, so I thought

Thank you for the information, that looks true to me as well.
crash has many checks to support various kernel versions and
configurations; maybe it's a trade-off for the usability.

Many commands assume that offsets and sizes are initialized at
startup time, so it may take much effort to reduce their
initialization.  Lazy (on-demand) initialization is a bit
interesting, but I'm not sure how much it would help, given that
startup takes only a few seconds in the first place and the
mechanism would add some overhead.
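
To illustrate what I mean by lazy initialization, a rough sketch could
look like the following; resolve_offset() is only a stand-in for the gdb
query that MEMBER_OFFSET_INIT() performs at startup today, not an
existing crash function, and the real offset_table handling differs:

    /* Rough sketch of lazy (on-demand) offset initialization; not how
     * crash's offset_table actually works. */
    #include <stdio.h>

    #define OFFSET_UNRESOLVED (-2L)

    /* dummy resolver so the sketch is self-contained; in crash this
     * would be the gdb query behind MEMBER_OFFSET_INIT() */
    static long resolve_offset(const char *strct, const char *member)
    {
            (void)strct;
            (void)member;
            return 0;
    }

    static long lazy_offset(long *cache, const char *strct, const char *member)
    {
            if (*cache == OFFSET_UNRESOLVED)    /* first use pays the cost */
                    *cache = resolve_offset(strct, member);
            return *cache;
    }

    int main(void)
    {
            static long task_struct_mm = OFFSET_UNRESOLVED;

            printf("task_struct.mm offset: %ld\n",
                   lazy_offset(&task_struct_mm, "task_struct", "mm"));
            return 0;
    }

The first command that touches a structure would pay the resolution
cost; later uses would just read the cached value.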

Sorry, I have no good idea for now.

Thanks,
Kazu


> maybe I can parallelize some of it to shorten the startup time.
> 
> Thanks,
> Tao Liu
> 
>>>
>>> One obvious place which can benefit from multi-threading is memory.c:vm_init().
>>> It contains hundreds of MEMBER_OFFSET_INIT()-style symbol-resolving calls,
>>> and most of the symbols are independent of each other, so with careful
>>> arrangement they can be invoked in parallel. By doing so, we can shorten the
>>> startup time of crash.
>>>
>>> The implementation is abstracted as follows:
>>>
>>> Before multi-threading:
>>> 	MEMBER_OFFSET_INIT(task_struct_mm, "task_struct", "mm");
>>> 	MEMBER_OFFSET_INIT(mm_struct_mmap, "mm_struct", "mmap");
>>>
>>> After multi-threading:
>>> 	create_threadpool(&pool, 3);
>>> 	...
>>> 	MEMBER_OFFSET_INIT_PARA(pool, task_struct_mm, "task_struct", "mm");
>>> 	MEMBER_OFFSET_INIT_PARA(pool, mm_struct_mmap, "mm_struct", "mmap");
>>> 	...
>>> 	wait_and_destroy_threadpool(pool);
>>>
>>> MEMBER_OFFSET_INIT_PARA just appends the task to the thread pool's work queue
>>> and continues; it's up to the pool to schedule a worker thread to do the
>>> symbol-resolving work.
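
(For reference, the general shape of such a _PARA macro could look like
the sketch below.  struct threadpool, threadpool_add() and
resolve_member_offset() are invented stand-ins for the patch's thpool.c
and for the symbol-resolving path behind MEMBER_OFFSET_INIT()
(datatype_info() in the backtrace below), and the destination is passed
as a pointer rather than as the offset_table member name used above.)

    /* Sketch of the general pattern only.  struct threadpool,
     * threadpool_add() and resolve_member_offset() are invented
     * stand-ins, not the patch's thpool.c or crash's real API. */
    #include <stdlib.h>

    struct threadpool;                                         /* opaque pool handle */
    extern int threadpool_add(struct threadpool *pool,
                              void (*fn)(void *), void *arg);  /* assumed enqueue API */
    extern long resolve_member_offset(const char *strct,
                                      const char *member);     /* gdb lookup stand-in */

    struct offset_task {
            long *slot;                /* where the resolved offset is stored */
            const char *strct;
            const char *member;
    };

    /* run by a pool worker: the same lookup the serial macro does today */
    static void offset_worker(void *arg)
    {
            struct offset_task *t = arg;

            *t->slot = resolve_member_offset(t->strct, t->member);
            free(t);
    }

    /* the _PARA form only enqueues the lookup and returns immediately;
     * wait_and_destroy_threadpool() later waits for every task to finish */
    #define MEMBER_OFFSET_INIT_PARA(pool, slot_ptr, strct, member)          \
            do {                                                            \
                    struct offset_task *t = malloc(sizeof(*t));             \
                    if (t) {                                                \
                            t->slot = (slot_ptr);                           \
                            t->strct = (strct);                             \
                            t->member = (member);                           \
                            threadpool_add((pool), offset_worker, t);       \
                    }                                                       \
            } while (0)
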
>>>
>>> However, after enabling multi-threading, I noticed that there are always random
>>> errors from gdb, ranging from segfaults to broken stacks; it seems gdb is not
>>> thread safe at all...
>>>
>>> For example, one of the errors is listed as follows:
>>>
>>> 	Thread 10 "crash" received signal SIGSEGV, Segmentation fault.
>>> 	[Switching to Thread 0x7fffc4f00640 (LWP 72950)]
>>> 	c_yylex () at /sources/up-crash/gdb-10.2/gdb/c-exp.y:3250
>>> 	3250   if (pstate->language ()->la_language != language_cplus
>>> 	(gdb) bt
>>> 	#0  c_yylex () at /sources/up-crash/gdb-10.2/gdb/c-exp.y:3250
>>> 	#1  c_yyparse () at /sources/up-crash/gdb-10.2/gdb/c-exp.c.tmp:2092
>>> 	#2  0x00000000006f62d7 in c_parse (par_state=<optimized out>) at /sources/
>>>               up-crash/gdb-10.2/gdb/c-exp.y:3414
>>> 	#3  0x0000000000894eac in parse_exp_in_context (stringptr=0x7fffc4efeff8,
>>>               pc=<optimized out>, block=<optimized out>, comma=0, out_subexp=0x0,
>>>               tracker=0x7fffc4efef10, cstate=0x0, void_context_p=0) at parse.c:1122
>>> 	#4  0x00000000008951d6 in parse_exp_1 (tracker=0x0, comma=0, block=0x0,
>>>               pc=0, stringptr=0x7fffc4efeff8) at parse.c:1031
>>> 	#5  parse_expression (string=<optimized out>, string@entry=0x7fffc4eff140
>>>               "slab_s", tracker=tracker@entry=0x0) at parse.c:1166
>>> 	#6  0x000000000092039a in gdb_get_datatype (req=0x7fffc4eff720) at symtab.c:7239
>>> 	#7  gdb_command_funnel_1 (req=0x7fffc4eff720) at symtab.c:7018
>>> 	#8  0x00000000009206de in gdb_command_funnel (req=0x7fffc4eff720) at symtab.c:6956
>>> 	#9  0x00000000005ad137 in gdb_interface (req=0x7fffc4eff720) at gdb_interface.c:409
>>> 	#10 0x00000000005fe76c in datatype_info (name=0xab9700 "slab_s",
>>>               member=0xaba8d8 "list", dm=0x0) at symbols.c:5708
>>> 	#11 0x0000000000517a85 in member_offset_init_slab_s_list_slab_s_list ()
>>>               at memory.c:659
>>> 	#12 0x000000000068168f in group_routine (args=<optimized out>) at thpool.c:81
>>> 	#13 0x00007ffff7a48b17 in start_thread () from /lib64/libc.so.6
>>> 	  #14 0x00007ffff7acd6c0 in clone3 () from /lib64/libc.so.6
>>> 	(gdb) p pstate
>>> 	$1 = (parser_state *) 0x0
>>>
>>> 	$ cat -n /sources/up-crash/gdb-10.2/gdb/c-exp.y
>>> 	66  /* The state of the parser, used internally when we are parsing the
>>> 	67     expression.  */
>>> 	68
>>> 	69  static struct parser_state *pstate = NULL;
>>>
>>> pstate is a global variable and is not thread safe; its value must have been
>>> changed by another thread...
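
(For what it's worth, globals like pstate mean that, at minimum, every
request entering gdb would have to be serialized.  A purely illustrative
sketch of that constraint is a single lock around the funnel, as below;
the wrapper and lock are hypothetical and the gdb_interface() signature
is assumed from the backtrace above.)

    /* Purely illustrative: with globals like pstate, every request
     * entering gdb has to be serialized.  gdb_interface() is the entry
     * point seen in the backtrace; the lock and wrapper are hypothetical. */
    #include <pthread.h>

    struct gnu_request;                                  /* crash's request type */
    extern void gdb_interface(struct gnu_request *req);  /* assumed signature */

    static pthread_mutex_t gdb_lock = PTHREAD_MUTEX_INITIALIZER;

    static void gdb_interface_serialized(struct gnu_request *req)
    {
            pthread_mutex_lock(&gdb_lock);     /* one thread at a time in gdb */
            gdb_interface(req);
            pthread_mutex_unlock(&gdb_lock);
    }

(Of course, a single lock around the funnel would also serialize away
most of the parallel speedup the pool was meant to provide.)
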
>>>
>>> Now the project has reached a dead end, because making gdb thread safe is an
>>> impossible mission for me. Do you have any advice or suggestions? Thanks in advance!
>>>
>>> Thanks!
>>> Tao Liu
>>>
--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/crash-utility
Contribution Guidelines: https://github.com/crash-utility/crash/wiki



