Re: [PATCH dwarves 0/6] btf_encoder: implement shared elf_functions table

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 04/10/2024 06:13, Ihor Solodrai wrote:
> This patchset changes how BTF encoders collect and store function
> information in a table searcheable by name, where their
> btf_encoder_function_state is maintained.
>

hi Ihor

I'll take a more detailed look shortly, but I'd suggest respinning the
series against the updated "next" branch shortly as there are changes in
flight that Stephen is working on to encode variables; they
significantly change the processing of variables, see [1] for details.
They have partly landed and the rest will likely land soon. It might
make things easier to wait until those changes land and respin, and I
don't think they invalidate the approach you suggest, just will require
some merging around variable handling I think.

[1] https://lore.kernel.org/dwarves/8734ldm72u.fsf@xxxxxxxxxx/


> Previously a separate array of elf_function structs was built and
> stored by each btf_encoder, and when the encoders were merged before
> dumping BTF their function tables would be traversed and merged too.
> 
> With these changes, the table used for searching for elf_function
> becomes shared between encoder threads. However, additional
> information in the table and in each individual elf_function has to be
> stored to accomodate the sharing.
> 
> The memory footprint is now slightly higher, although it degrades
> faster when number of threads is increased without this patchset.
> 
> /usr/bin/time -v with -j8:
> 
>     With patchset: Maximum resident set size (kbytes): 1402792 (~2% higher)
>     Without:       Maximum resident set size (kbytes): 1376700
> 
> /usr/bin/time -v with -j16:
> 
>     With patchset: Maximum resident set size (kbytes): 1456044
>     Without:       Maximum resident set size (kbytes): 1450748
> 
> The idea to make function table shared between encoders has been
> mentioned previously [1].
> 
> I've been looking into ways of improving parallel reproducible_build
> time, ideally making it as fast as non-reproducible.  In an off-list
> discussion Eduard Zingerman suggested creating a btf_encoder for each
> input compilation unit. This would allow to enforce the order of BTF
> merge after btf_encoders are done.
> 
> With each encoder maintaining it's own table this idea proved
> infeasible [2]. And so making it shared became a requirement.
> 
> I implemented a prototype that improved the speed of parallel
> reproducible_build by using a similar shared table and implementing
> Eduard's idea [3]. On master (84ab453) reproducible_build was ~50%
> slower than non-reproducible, and with my experimental changes it was
> 20% slower.
> 
> This patchset stems from that work and discussions. The longer term
> goal is to make BTF encoding reproducible by default, and fast.
> 
> [1] https://lore.kernel.org/dwarves/20240909155031.1596249-1-alan.maguire@xxxxxxxxxx/
> [2] https://lore.kernel.org/dwarves/aQtEzThpIhRiwQpkgeZepPGr5Mt_zuGfPbxQxZgS6SSoPYoqM1afjDqEmIZkdH3YzvhWmWwqCS_8ZvFTcSZHvpkAeBpLRTAZEmrOhq0svfo=@pm.me/
> [3] https://github.com/theihor/dwarves/pull/1/files
> 
> Ihor Solodrai (6):
>   dwarf_loader: introduce pre/post cus__load_module hooks to conf_load
>   btf_encoder: split btf_encoder__collect_symbols()
>   btf_encoder: introduce elf_functions struct type
>   btf_encoder: collect elf_functions in pre_cus__load_module
>   btf_encoder: introduce elf_functions_entry
>   btf_encoder: switch to shared elf_functions table
> 
>  btf_encoder.c  | 341 ++++++++++++++++++++++++++++++++++++-------------
>  btf_encoder.h  |   5 +-
>  dwarf_loader.c |  21 ++-
>  dwarves.h      |  18 ++-
>  pahole.c       |  28 ++--
>  5 files changed, 303 insertions(+), 110 deletions(-)
> 





[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux