On 2022/9/21 10:42, Leizhen (ThunderTown) wrote: > > > On 2022/9/21 1:26, Petr Mladek wrote: >> On Fri 2022-09-09 21:00:09, Zhen Lei wrote: >>> Currently, to search for a symbol, we need to expand the symbols in >>> 'kallsyms_names' one by one, and then use the expanded string for >>> comparison. This is very slow. >>> >>> In fact, we can first compress the name being looked up and then use >>> it for comparison when traversing 'kallsyms_names'. >> >> This does not explain how this patch modifies the compressed data >> and why it is needed. > > Yes, I have updated the description from the v3 version. > > So if we don't compress the symbol type, we can first compress the > searched symbol and then make a quick comparison based on the compressed > length and content. In this way, for entries with mismatched lengths, > there is no need to expand and compare strings. And for those matching > lengths, there's no need to expand the symbol. This saves a lot of time. > >> >> >>> This increases the size of 'kallsyms_names'. About 48KiB, 2.67%, on x86 >>> with defconfig. >>> Before: kallsyms_num_syms=131392, sizeof(kallsyms_names)=1823659 >>> After : kallsyms_num_syms=131392, sizeof(kallsyms_names)=1872418 >>> >>> However, if CONFIG_KALLSYMS_ALL is not set, the size of 'kallsyms_names' >>> does not change. >>> >>> Signed-off-by: Zhen Lei <thunder.leizhen@xxxxxxxxxx> >>> --- >>> scripts/kallsyms.c | 15 ++++++++++++--- >>> 1 file changed, 12 insertions(+), 3 deletions(-) >>> >>> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c >>> index f18e6dfc68c5839..ab6fe7cd014efd1 100644 >>> --- a/scripts/kallsyms.c >>> +++ b/scripts/kallsyms.c >>> @@ -60,6 +60,7 @@ static unsigned int table_size, table_cnt; >>> static int all_symbols; >>> static int absolute_percpu; >>> static int base_relative; >>> +static int sym_start_idx; >>> >>> static int token_profit[0x10000]; >>> >>> @@ -511,7 +512,7 @@ static void learn_symbol(const unsigned char *symbol, int len) >>> { >>> int i; >>> >>> - for (i = 0; i < len - 1; i++) >>> + for (i = sym_start_idx; i < len - 1; i++) >>> token_profit[ symbol[i] + (symbol[i + 1] << 8) ]++; >> >> This skips the first character in the @symbol string. I do not see how >> this is used in the new code, for example, in >> kallsyms_on_each_match_symbol(), in the 5th patch. It seems to iterate >> the compressed data from the 0th index: >> >> for (i = 0, off = 0; i < kallsyms_num_syms; i++) >> >>> } >>> >>> @@ -520,7 +521,7 @@ static void forget_symbol(const unsigned char *symbol, int len) >>> { >>> int i; >>> >>> - for (i = 0; i < len - 1; i++) >>> + for (i = sym_start_idx; i < len - 1; i++) >>> token_profit[ symbol[i] + (symbol[i + 1] << 8) ]--; >>> } >>> >>> @@ -538,7 +539,7 @@ static unsigned char *find_token(unsigned char *str, int len, >>> { >>> int i; >>> >>> - for (i = 0; i < len - 1; i++) { >>> + for (i = sym_start_idx; i < len - 1; i++) { >>> if (str[i] == token[0] && str[i+1] == token[1]) >>> return &str[i]; >>> } >>> @@ -780,6 +781,14 @@ int main(int argc, char **argv) >>> } else if (argc != 1) >>> usage(); >>> >>> + /* >>> + * Skip the symbol type, do not compress it to optimize the performance >>> + * of finding or traversing symbols in kernel, this is good for modules >>> + * such as livepatch. >> >> I see. The type is added as the first character here. >> >> in static struct sym_entry *read_symbol(FILE *in) >> { >> [...] >> /* include the type field in the symbol name, so that it gets >> * compressed together */ > > Good catch. I should remove "so that it gets compressed together" > >> [...] >> sym->sym[0] = type; >> strcpy(sym_name(sym), name); >> >> It sounds a bit crazy. read_symbol() makes a trick so that the type >> can be compressed. This patch does another trick to avoid it. >> >> >>> + */ >>> + if (all_symbols) >>> + sym_start_idx = 1; >> >> This looks a bit fragile. My understanding is that the new code in >> kernel/kallsyms.c and kernel/module/kallsyms.c depends on this change. > > They do not depend on this change, because the index in > insert_real_symbols_in_table() is still starting from 0. kallsyms_expand_symbol() > shows that it uses every byte of the compressed data to look up the token table. > The index in insert_real_symbols_in_table() starting from 0 make sure that the > raw character of 'type' occupies a separate position in kallsyms_token_table[]. > So that kallsyms_expand_symbol() can still work well. > >> >> The faster search is used when CONFIG_KALLSYMS_ALL is defined. >> But the data are compressed this way when this script is called >> with --all-symbols. >> >> Is it guaranteed that this script will generate the needed data >> when CONFIG_KALLSYMS_ALL is defined? > > Yes, see kallsyms() in scripts/link-vmlinux.sh > if is_enabled CONFIG_KALLSYMS_ALL; then > kallsymopt="${kallsymopt} --all-symbols" > fi > >> >> What about 3rd party modules? > > Should they call the API directly? > >> >> I would personally suggest to store the symbol type into a separate >> sym->type entry in struct sym_entry and never compress it. > > Yes,I've also considered this, for the purpose of increasing the > compression ratio. See below, if the sorting is performed based on > the address and then based on the type. We can record all the symbol > type information in less than 100 bytes. Of course, this makes the > functions that look up symbols based on the address loop serveral > times more. However, I would like to wait until the current patch > series is accepted. Otherwise, I'll have to rework a lot of patches > and it's too much work. To be honest, I've been coding for it these days. The main thing is, I don't know if there will be any other impact after this change. If this doesn't work, I don't have to do it all over again. I can refactor it after it have been fully reviewed. > > cat /proc/kallsyms | awk '{print $2}' | sort | uniq -c | sort -r > 44678 r > 38299 t > 28315 T > 11644 d > 3768 D > 2778 b > 778 R > 641 B > 282 A > 178 W > 37 V > >> >> IMHO, the size win is not worth the code complexity. >> >> Well, people compiling the kernel for small devices might think >> different. But they probably disable kallsyms completely. > > Yes, to make the code look better, I've stopped binding CONFIG_KALLSYMS_ALL since v3. > > 3. The symbol type is not compressed regardless of whether > CONFIG_KALLSYMS_ALL is set or not. The memory overhead is increased > by less than 20KiB if CONFIG_KALLSYMS_ALL=n. > >> >> Best Regards, >> Petr >> . >> > -- Regards, Zhen Lei