Hello Masahiro, Thank you for your suggestions, Il giorno sab 2 set 2023 alle ore 08:36 Masahiro Yamada <masahiroy@xxxxxxxxxx> ha scritto: > > On Mon, Aug 28, 2023 at 8:45 PM Alessandro Carminati (Red Hat) > <alessandro.carminati@xxxxxxxxx> wrote: > > > > From: Alessandro Carminati <alessandro.carminati@xxxxxxxxx> > > > > It is not uncommon for drivers or modules related to similar peripherals > > to have symbols with the exact same name. > > While this is not a problem for the kernel's binary itself, it becomes an > > issue when attempting to trace or probe specific functions using > > infrastructure like ftrace or kprobe. > > > > The tracing subsystem relies on the `nm -n vmlinux` output, which provides > > symbol information from the kernel's ELF binary. However, when multiple > > symbols share the same name, the standard nm output does not differentiate > > between them. This can lead to confusion and difficulty when trying to > > probe the intended symbol. > > > > ~ # cat /proc/kallsyms | grep " name_show" > > ffffffff8c4f76d0 t name_show > > ffffffff8c9cccb0 t name_show > > ffffffff8cb0ac20 t name_show > > ffffffff8cc728c0 t name_show > > ffffffff8ce0efd0 t name_show > > ffffffff8ce126c0 t name_show > > ffffffff8ce1dd20 t name_show > > ffffffff8ce24e70 t name_show > > ffffffff8d1104c0 t name_show > > ffffffff8d1fe480 t name_show > > > > **kas_alias** addresses this challenge by extending the symbol names with > > unique suffixes during the kernel build process. > > The newly created aliases for these duplicated symbols are unique names > > that can be fed to the ftracefs interface. By doing so, it enables > > previously unreachable symbols to be probed. > > > > ~ # cat /proc/kallsyms | grep " name_show" > > ffffffff974f76d0 t name_show > > ffffffff974f76d0 t name_show__alias__6340 > > ffffffff979cccb0 t name_show > > ffffffff979cccb0 t name_show__alias__6341 > > ffffffff97b0ac20 t name_show > > ffffffff97b0ac20 t name_show__alias__6342 > > ffffffff97c728c0 t name_show > > ffffffff97c728c0 t name_show__alias__6343 > > ffffffff97e0efd0 t name_show > > ffffffff97e0efd0 t name_show__alias__6344 > > ffffffff97e126c0 t name_show > > ffffffff97e126c0 t name_show__alias__6345 > > ffffffff97e1dd20 t name_show > > ffffffff97e1dd20 t name_show__alias__6346 > > ffffffff97e24e70 t name_show > > ffffffff97e24e70 t name_show__alias__6347 > > ffffffff981104c0 t name_show > > ffffffff981104c0 t name_show__alias__6348 > > ffffffff981fe480 t name_show > > ffffffff981fe480 t name_show__alias__6349 > > ~ # echo "p:kprobes/evnt1 name_show__alias__6349" \ > > > >/sys/kernel/tracing/kprobe_events > > ~ # cat /sys/kernel/tracing/kprobe_events > > p:kprobes/evnt1 name_show__alias__6349 > > > > Changes from v1: > > - Integrated changes requested by Masami to exclude symbols with prefixes > > "_cfi" and "_pfx". > > - Introduced a small framework to handle patterns that need to be excluded > > from the alias production. > > - Excluded other symbols using the framework. > > - Introduced the ability to discriminate between text and data symbols. > > - Added two new config symbols in this version: CONFIG_KALLSYMS_ALIAS_DATA, > > which allows data for data, and CONFIG_KALLSYMS_ALIAS_DATA_ALL, which > > excludes all filters and provides an alias for each duplicated symbol. > > > > https://lore.kernel.org/all/20230711151925.1092080-1-alessandro.carminati@xxxxxxxxx/ > > > > Changes from v2: > > - Alias tags are created by querying DWARF information from the vmlinux. > > - The filename + line number is normalized and appended to the original name. > > - The tag begins with '@' to indicate the symbol source. > > - Not a change, but worth mentioning, since the alias is added to the existing > > list, the old duplicated name is preserved, and the livepatch way of dealing > > with duplicates is maintained. > > - Acknowledging the existence of scenarios where inlined functions declared in > > header files may result in multiple copies due to compiler behavior, though > > it is not actionable as it does not pose an operational issue. > > - Highlighting a single exception where the same name refers to different > > functions: the case of "compat_binfmt_elf.c," which directly includes > > "binfmt_elf.c" producing identical function copies in two separate > > modules. > > > > sample from new v3 > > > > ~ # cat /proc/kallsyms | grep gic_mask_irq > > ffffd0b03c04dae4 t gic_mask_irq > > ffffd0b03c04dae4 t gic_mask_irq@_drivers_irqchip_irq-gic_c_167 > > ffffd0b03c050960 t gic_mask_irq > > ffffd0b03c050960 t gic_mask_irq@_drivers_irqchip_irq-gic-v3_c_404 > > ~ # > > > > https://lore.kernel.org/all/20230714150326.1152359-1-alessandro.carminati@xxxxxxxxx/ > > > > Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati@xxxxxxxxx> > > --- > > init/Kconfig | 36 ++++ > > scripts/Makefile | 4 + > > scripts/kas_alias/Makefile | 4 + > > scripts/kas_alias/a2l.c | 268 ++++++++++++++++++++++++++++ > > scripts/kas_alias/a2l.h | 32 ++++ > > scripts/kas_alias/duplicates_list.c | 70 ++++++++ > > scripts/kas_alias/duplicates_list.h | 15 ++ > > scripts/kas_alias/item_list.c | 230 ++++++++++++++++++++++++ > > scripts/kas_alias/item_list.h | 26 +++ > > scripts/kas_alias/kas_alias.c | 217 ++++++++++++++++++++++ > > scripts/link-vmlinux.sh | 11 +- > > 11 files changed, 910 insertions(+), 3 deletions(-) > > > I added some review comments in another thread, but > one of the biggest concerns might be "910 insertions". > > > What this program does is quite simple, > "find duplicated names, and call addr2line". > > > > You wrote a lot of code to self-implement these: > > - sort function > - parse PATH env variable to find addr2line > - fork addr2line to establish pipe communications > > > > Have you considered writing the code in Python (or Perl)? > Is it too slow? I have attempted to incorporate all your suggestions. I refactored the C code to utilize hashing instead of sorting, and I completely re-implemented the entire thing in Python for the purpose of comparison. You are correct; the C version is indeed faster, but the difference is negligible when considering the use case and code maintainability. Here's a direct comparison of the two. ``` ~ $ time ./kas_alias.py -a /usr/bin/aarch64-linux-gnu-addr2line \ -n linux-6.5/.tmp_vmlinux.kallsyms1.syms \ -v linux-6.5/.tmp_vmlinux.kallsyms1 \ -o output_py real 0m1.626s user 0m1.436s sys 0m0.185s $ cat kas_alias.py | wc -l 133 ~ $ time ./kas_alias -a /usr/bin/aarch64-linux-gnu-addr2line \ -v linux-6.5/.tmp_vmlinux.kallsyms1 \ -n linux-6.5/.tmp_vmlinux.kallsyms1.syms \ -o output_c real 0m1.418s user 0m1.262s sys 0m0.162s ~ $ cat a2l.c a2l.h conf.c conf.h item_list.c item_list.h kas_alias.c | wc -l 742 ~ $ diff output_py output_c ~ $ ``` C version is 7/10% faster but is more than 5 times in terms of code size. > > Most of the functions you implemented are already > available in script languages. > > > > I am not sure if "@<file-path>" is a good solution, > but the amount of the added code looks too much to me. I followed Francis's suggestion and made the separator between <symbol name> and <normalized filename> an argument that you can select using the command line. Since I'm not aware of a better choice, I set the default value to '@'. > > > > > -- > Best Regards > Masahiro Yamada Best regards Alessandro Carminati