Re: [RFC/PATCHES 00/12] pahole: Reproducible parallel DWARF loading/serial BTF encoding

On Thu, 2024-04-04 at 09:05 +0100, Alan Maguire wrote:
[...]

> Could that be the handling of functions with same name, inconsistent
> prototypes? We leave them out deliberately (see
> btf_encoder__add_saved_funcs()).
> 
> > I'll try to figure out the reason for slowdown tomorrow.
> > 
> > [1] https://github.com/eddyz87/dwarves/tree/sort-btf
> > 

Fwiw, the best I can do is here:
https://github.com/eddyz87/dwarves/tree/sort-btf

It adds a total ordering on BTF types using properties such as kind, kflag, vlen and name,
and rebuilds the final BTF to follow this order. Here are the measurements:

$ sudo cpupower frequency-set --min 3Ghz --max 3Ghz
$ nice -n18 perf stat -r50 pahole --reproducible_build -j --btf_encode_detached=/dev/null vmlinux
           ...
           3.08648 +- 0.00813 seconds time elapsed  ( +-  0.26% )

$ nice -n18 perf stat -r50 pahole -j --btf_encode_detached=/dev/null vmlinux
           ...
           3.00785 +- 0.00878 seconds time elapsed  ( +-  0.29% )

This gives a 2.6% total time penalty (3.08648 s vs. 3.00785 s) when the reproducible build option is used.
./test_progs tests are passing.
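
For illustration only, the ordering idea described above could be sketched
roughly as below. This is not the code from the sort-btf branch: struct
type_key, type_key_cmp() and sort_types() are hypothetical names, the
flattened struct is an assumption rather than the libbpf/pahole layout, and
the comparison stops at the name, whereas a real total order would need
further properties to break the remaining ties.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flattened view of one BTF type (not the real btf_type). */
struct type_key {
	uint32_t kind;
	uint32_t kflag;
	uint32_t vlen;
	const char *name;	/* "" for anonymous types */
	uint32_t old_id;	/* 0-based position before sorting */
};

static int cmp_u32(uint32_t a, uint32_t b)
{
	return (a > b) - (a < b);
}

/* Compare by kind, then kflag, then vlen, then name. */
int type_key_cmp(const void *pa, const void *pb)
{
	const struct type_key *a = pa, *b = pb;
	int c;

	c = cmp_u32(a->kind, b->kind);
	if (c)
		return c;
	c = cmp_u32(a->kflag, b->kflag);
	if (c)
		return c;
	c = cmp_u32(a->vlen, b->vlen);
	if (c)
		return c;
	/* Ties left here would need more properties in a real encoder. */
	return strcmp(a->name, b->name);
}

/*
 * Sort the keys in place and record remap[old_id] = new position, so
 * that references between types could be rewritten afterwards.
 * Assumes every old_id is smaller than n.
 */
void sort_types(struct type_key *keys, size_t n, uint32_t *remap)
{
	size_t i;

	assert(keys != NULL && remap != NULL);
	qsort(keys, n, sizeof(*keys), type_key_cmp);
	for (i = 0; i < n; i++)
		remap[keys[i].old_id] = (uint32_t)i;
}
```

The remap table is what makes it possible to rebuild the final BTF in the new
order: each type reference is looked up by its old position and replaced with
the new one.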

Still, there are a few discrepancies in generated BTFs: some function
prototypes are included twice at random (about 30 IDs added/deleted).
This might be connected to Alan's suggestion and requires
further investigation.

All in all, Arnaldo's approach with CU ordering looks simpler.

Best regards,
Eduard




