This is v3 of the patchset aiming to speed up parallel reproducible
BTF encoding.

In comparison to v2:
  - removed patch v2 03 adding pre_load_module hook
  - removed patch v2 05 making use of the hook
    - since we will have a single btf_encoder, there is no need to
      collect ELF tables before encoders are created
  - removed patch v2 07 adding btf_encoder_context
  - patch v3 04 is a rewritten patch v2 06
    - each btf_encoder now maintains its own list of function tables
      per ELF
  - patch v3 07 is an updated patch v2 10
    - dwarf_loader multithreading is adjusted in an attempt to
      minimize blocking on locks
  - new patch v3 08 increases the cu->obstack chunk size
  - new patch v3 09 cleans up global list of encoders in btf_encoder.c

Testing:
  - ./tests/tests pass on vmlinux built from bpf-next
  - bpftool dump of reproducible BTF is identical to v1.28

Sample perf runs on 6.9 kernel with a production-like config, on a
machine with nproc=176:

This patchset:

    Performance counter stats for '/home/isolodrai/dwarves/build/pahole -J -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs --lang_exclude=rust --btf_encode_detached=/dev/null .tmp_vmlinux.btf' (13 runs):

         17,911.11 msec cpu-clock       #    4.412 CPUs utilized    ( +-  0.46% )

            4.0600 +- 0.0116 seconds time elapsed  ( +-  0.29% )

pahole/next (v1.28):

    Performance counter stats for '/home/isolodrai/dwarves/build/pahole -J -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs --lang_exclude=rust --btf_encode_detached=/dev/null .tmp_vmlinux.btf' (13 runs):

         82,289.12 msec cpu-clock       #   17.427 CPUs utilized    ( +-  0.54% )

            4.7219 +- 0.0270 seconds time elapsed  ( +-  0.57% )

v2: https://lore.kernel.org/dwarves/20241213223641.564002-1-ihor.solodrai@xxxxx/
v1 RFC: https://lore.kernel.org/dwarves/20241128012341.4081072-1-ihor.solodrai@xxxxx/

Alan Maguire (2):
  btf_encoder: simplify function encoding
  btf_encoder: separate elf function, saved function representations

Ihor Solodrai (6):
  btf_encoder: introduce elf_functions struct type
  btf_encoder: introduce elf_functions_list
  btf_encoder: remove skip_encoding_inconsistent_proto
  dwarf_loader: introduce cu->id
  dwarf_loader: multithreading with a job/worker model
  btf_encoder: clean up global encoders list

 btf_encoder.c               | 643 +++++++++++++++++++-----------------
 btf_encoder.h               |   7 +-
 btf_loader.c                |   2 +-
 ctf_loader.c                |   2 +-
 dwarf_loader.c              | 335 +++++++++++++------
 dwarves.c                   |  44 ---
 dwarves.h                   |  21 +-
 pahole.c                    | 230 ++-----------
 pdwtags.c                   |   3 +-
 pfunct.c                    |   3 +-
 tests/reproducible_build.sh |   5 +-
 11 files changed, 605 insertions(+), 690 deletions(-)

-- 
2.47.1