This patchset changes how BTF encoders collect and store function information in a table searcheable by name, where their btf_encoder_function_state is maintained. Previously a separate array of elf_function structs was built and stored by each btf_encoder, and when the encoders were merged before dumping BTF their function tables would be traversed and merged too. With these changes, the table used for searching for elf_function becomes shared between encoder threads. However, additional information in the table and in each individual elf_function has to be stored to accomodate the sharing. The memory footprint is now slightly higher, although it degrades faster when number of threads is increased without this patchset. /usr/bin/time -v with -j8: With patchset: Maximum resident set size (kbytes): 1402792 (~2% higher) Without: Maximum resident set size (kbytes): 1376700 /usr/bin/time -v with -j16: With patchset: Maximum resident set size (kbytes): 1456044 Without: Maximum resident set size (kbytes): 1450748 The idea to make function table shared between encoders has been mentioned previously [1]. I've been looking into ways of improving parallel reproducible_build time, ideally making it as fast as non-reproducible. In an off-list discussion Eduard Zingerman suggested creating a btf_encoder for each input compilation unit. This would allow to enforce the order of BTF merge after btf_encoders are done. With each encoder maintaining it's own table this idea proved infeasible [2]. And so making it shared became a requirement. I implemented a prototype that improved the speed of parallel reproducible_build by using a similar shared table and implementing Eduard's idea [3]. On master (84ab453) reproducible_build was ~50% slower than non-reproducible, and with my experimental changes it was 20% slower. This patchset stems from that work and discussions. The longer term goal is to make BTF encoding reproducible by default, and fast. [1] https://lore.kernel.org/dwarves/20240909155031.1596249-1-alan.maguire@xxxxxxxxxx/ [2] https://lore.kernel.org/dwarves/aQtEzThpIhRiwQpkgeZepPGr5Mt_zuGfPbxQxZgS6SSoPYoqM1afjDqEmIZkdH3YzvhWmWwqCS_8ZvFTcSZHvpkAeBpLRTAZEmrOhq0svfo=@pm.me/ [3] https://github.com/theihor/dwarves/pull/1/files Ihor Solodrai (6): dwarf_loader: introduce pre/post cus__load_module hooks to conf_load btf_encoder: split btf_encoder__collect_symbols() btf_encoder: introduce elf_functions struct type btf_encoder: collect elf_functions in pre_cus__load_module btf_encoder: introduce elf_functions_entry btf_encoder: switch to shared elf_functions table btf_encoder.c | 341 ++++++++++++++++++++++++++++++++++++------------- btf_encoder.h | 5 +- dwarf_loader.c | 21 ++- dwarves.h | 18 ++- pahole.c | 28 ++-- 5 files changed, 303 insertions(+), 110 deletions(-) -- 2.34.1