On Wed, Sep 6, 2023 at 7:09 PM Alessandro Carminati <alessandro.carminati@xxxxxxxxx> wrote: > > Hello Masahiro, > > Thank you for your suggestions, > Il giorno sab 2 set 2023 alle ore 08:36 Masahiro Yamada > <masahiroy@xxxxxxxxxx> ha scritto: > > > > On Mon, Aug 28, 2023 at 8:45 PM Alessandro Carminati (Red Hat) > > <alessandro.carminati@xxxxxxxxx> wrote: > > > > > > From: Alessandro Carminati <alessandro.carminati@xxxxxxxxx> > > > > > > It is not uncommon for drivers or modules related to similar peripherals > > > to have symbols with the exact same name. > > > While this is not a problem for the kernel's binary itself, it becomes an > > > issue when attempting to trace or probe specific functions using > > > infrastructure like ftrace or kprobe. > > > > > > The tracing subsystem relies on the `nm -n vmlinux` output, which provides > > > symbol information from the kernel's ELF binary. However, when multiple > > > symbols share the same name, the standard nm output does not differentiate > > > between them. This can lead to confusion and difficulty when trying to > > > probe the intended symbol. > > > > > > ~ # cat /proc/kallsyms | grep " name_show" > > > ffffffff8c4f76d0 t name_show > > > ffffffff8c9cccb0 t name_show > > > ffffffff8cb0ac20 t name_show > > > ffffffff8cc728c0 t name_show > > > ffffffff8ce0efd0 t name_show > > > ffffffff8ce126c0 t name_show > > > ffffffff8ce1dd20 t name_show > > > ffffffff8ce24e70 t name_show > > > ffffffff8d1104c0 t name_show > > > ffffffff8d1fe480 t name_show > > > > > > **kas_alias** addresses this challenge by extending the symbol names with > > > unique suffixes during the kernel build process. > > > The newly created aliases for these duplicated symbols are unique names > > > that can be fed to the ftracefs interface. By doing so, it enables > > > previously unreachable symbols to be probed. > > > > > > ~ # cat /proc/kallsyms | grep " name_show" > > > ffffffff974f76d0 t name_show > > > ffffffff974f76d0 t name_show__alias__6340 > > > ffffffff979cccb0 t name_show > > > ffffffff979cccb0 t name_show__alias__6341 > > > ffffffff97b0ac20 t name_show > > > ffffffff97b0ac20 t name_show__alias__6342 > > > ffffffff97c728c0 t name_show > > > ffffffff97c728c0 t name_show__alias__6343 > > > ffffffff97e0efd0 t name_show > > > ffffffff97e0efd0 t name_show__alias__6344 > > > ffffffff97e126c0 t name_show > > > ffffffff97e126c0 t name_show__alias__6345 > > > ffffffff97e1dd20 t name_show > > > ffffffff97e1dd20 t name_show__alias__6346 > > > ffffffff97e24e70 t name_show > > > ffffffff97e24e70 t name_show__alias__6347 > > > ffffffff981104c0 t name_show > > > ffffffff981104c0 t name_show__alias__6348 > > > ffffffff981fe480 t name_show > > > ffffffff981fe480 t name_show__alias__6349 > > > ~ # echo "p:kprobes/evnt1 name_show__alias__6349" \ > > > > >/sys/kernel/tracing/kprobe_events > > > ~ # cat /sys/kernel/tracing/kprobe_events > > > p:kprobes/evnt1 name_show__alias__6349 > > > > > > Changes from v1: > > > - Integrated changes requested by Masami to exclude symbols with prefixes > > > "_cfi" and "_pfx". > > > - Introduced a small framework to handle patterns that need to be excluded > > > from the alias production. > > > - Excluded other symbols using the framework. > > > - Introduced the ability to discriminate between text and data symbols. > > > - Added two new config symbols in this version: CONFIG_KALLSYMS_ALIAS_DATA, > > > which allows data for data, and CONFIG_KALLSYMS_ALIAS_DATA_ALL, which > > > excludes all filters and provides an alias for each duplicated symbol. > > > > > > https://lore.kernel.org/all/20230711151925.1092080-1-alessandro.carminati@xxxxxxxxx/ > > > > > > Changes from v2: > > > - Alias tags are created by querying DWARF information from the vmlinux. > > > - The filename + line number is normalized and appended to the original name. > > > - The tag begins with '@' to indicate the symbol source. > > > - Not a change, but worth mentioning, since the alias is added to the existing > > > list, the old duplicated name is preserved, and the livepatch way of dealing > > > with duplicates is maintained. > > > - Acknowledging the existence of scenarios where inlined functions declared in > > > header files may result in multiple copies due to compiler behavior, though > > > it is not actionable as it does not pose an operational issue. > > > - Highlighting a single exception where the same name refers to different > > > functions: the case of "compat_binfmt_elf.c," which directly includes > > > "binfmt_elf.c" producing identical function copies in two separate > > > modules. > > > > > > sample from new v3 > > > > > > ~ # cat /proc/kallsyms | grep gic_mask_irq > > > ffffd0b03c04dae4 t gic_mask_irq > > > ffffd0b03c04dae4 t gic_mask_irq@_drivers_irqchip_irq-gic_c_167 > > > ffffd0b03c050960 t gic_mask_irq > > > ffffd0b03c050960 t gic_mask_irq@_drivers_irqchip_irq-gic-v3_c_404 > > > ~ # > > > > > > https://lore.kernel.org/all/20230714150326.1152359-1-alessandro.carminati@xxxxxxxxx/ > > > > > > Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati@xxxxxxxxx> > > > --- > > > init/Kconfig | 36 ++++ > > > scripts/Makefile | 4 + > > > scripts/kas_alias/Makefile | 4 + > > > scripts/kas_alias/a2l.c | 268 ++++++++++++++++++++++++++++ > > > scripts/kas_alias/a2l.h | 32 ++++ > > > scripts/kas_alias/duplicates_list.c | 70 ++++++++ > > > scripts/kas_alias/duplicates_list.h | 15 ++ > > > scripts/kas_alias/item_list.c | 230 ++++++++++++++++++++++++ > > > scripts/kas_alias/item_list.h | 26 +++ > > > scripts/kas_alias/kas_alias.c | 217 ++++++++++++++++++++++ > > > scripts/link-vmlinux.sh | 11 +- > > > 11 files changed, 910 insertions(+), 3 deletions(-) > > > > > > I added some review comments in another thread, but > > one of the biggest concerns might be "910 insertions". > > > > > > What this program does is quite simple, > > "find duplicated names, and call addr2line". > > > > > > > > You wrote a lot of code to self-implement these: > > > > - sort function > > - parse PATH env variable to find addr2line > > - fork addr2line to establish pipe communications > > > > > > > > Have you considered writing the code in Python (or Perl)? > > Is it too slow? > > I have attempted to incorporate all your suggestions. > I refactored the C code to utilize hashing instead of sorting, and I > completely re-implemented the entire thing in Python for the purpose of > comparison. > > You are correct; > the C version is indeed faster, but the difference is negligible when > considering the use case and code maintainability. Nice. Then, I prefer shorter code. The Python implementation is 0.2 sec slower (given the script is executed three times, 0.6 sec cost in total) but it is not a big issue, I think. Thanks. > > Here's a direct comparison of the two. > ``` > ~ $ time ./kas_alias.py -a /usr/bin/aarch64-linux-gnu-addr2line \ > -n linux-6.5/.tmp_vmlinux.kallsyms1.syms \ > -v linux-6.5/.tmp_vmlinux.kallsyms1 \ > -o output_py > > real 0m1.626s > user 0m1.436s > sys 0m0.185s > $ cat kas_alias.py | wc -l > 133 > ~ $ time ./kas_alias -a /usr/bin/aarch64-linux-gnu-addr2line \ > -v linux-6.5/.tmp_vmlinux.kallsyms1 \ > -n linux-6.5/.tmp_vmlinux.kallsyms1.syms \ > -o output_c > > real 0m1.418s > user 0m1.262s > sys 0m0.162s > ~ $ cat a2l.c a2l.h conf.c conf.h item_list.c item_list.h kas_alias.c | wc -l > 742 > ~ $ diff output_py output_c > ~ $ > ``` > C version is 7/10% faster but is more than 5 times in terms of code size. > > > > > Most of the functions you implemented are already > > available in script languages. > > > > > > > > I am not sure if "@<file-path>" is a good solution, > > but the amount of the added code looks too much to me. > > I followed Francis's suggestion and made the separator between > <symbol name> and <normalized filename> an argument that you can select > using the command line. Since I'm not aware of a better choice, I set the > default value to '@'. > > > > > > > > > > > -- > > Best Regards > > Masahiro Yamada > > Best regards > Alessandro Carminati -- Best Regards Masahiro Yamada