On January 20, 2022 4:59:08 PM GMT-03:00, Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>+cc bpf@xxxxxxxxxxxxxxx
>
>On Wed, Jan 19, 2022 at 5:08 PM Kui-Feng Lee <kuifeng@xxxxxx> wrote:
>>
>> Creating an instance of btf for each worker thread allows
>> steal-function provided by pahole to add type info on multiple threads
>> without a lock. The main thread merges the results of worker threads
>> to the primary instance.
>>
>> Copying data from per-thread btf instances to the primary instance is
>> expensive now. However, there is a patch landed at the bpf-next
>> repository. [1] With the patch for bpf-next and this patch, they drop
>> total runtime to 5.4s from 6.0s with "-j4" on my device to generate
>> BTF for Linux.
>
>Just a few more data points. I've tried this locally with 40 cores,
>both with and without the libbpf's btf__add_btf() optimization.
>
>BASELINE NON-PARALLEL
>=====================
>$ time ./pahole -J ~/linux-build/default/vmlinux
>./pahole -J ~/linux-build/default/vmlinux 11.17s user 0.66s system
>99% cpu 11.832 total
>
>BASELINE PARALLEL
>=================
>$ time ./pahole -j40 -J ~/linux-build/default/vmlinux
>./pahole -j40 -J ~/linux-build/default/vmlinux 13.85s user 0.75s
>system 290% cpu 5.023 total
>
>THESE PATCHES WITHOUT LIBBPF SPEED-UP
>=====================================
>$ time ./pahole -j40 -J ~/linux-build/default/vmlinux
>./pahole -j40 -J ~/linux-build/default/vmlinux 25.94s user 1.15s
>system 685% cpu 3.954 total
>
>THESE PATCHES WITH LATEST LIBBPF SPEED-UP
>=========================================
>$ time ./pahole -j40 -J ~/linux-build/default/vmlinux
>./pahole -j40 -J ~/linux-build/default/vmlinux 27.49s user 1.08s
>system 858% cpu 3.328 total
>
>
>So on 40 cores, it's a speed up from 11.8 seconds non-parallel, to 5s
>parallel without Kui-Feng's changes, to 4s with Kui-Feng's changes, to
>3.3s after libbpf update (I did it locally, will sync this to Github
>today).
>
>4x speed up, not bad!

That's indeed excellent!
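
As a quick check on the numbers quoted above, the wall-clock totals can be turned into speedup ratios. This is only an illustrative Python sketch: the dictionary labels and the helper name `speedups` are made up here, and the timings are copied from the `time` output in the quoted mail.

```python
# Wall-clock totals (seconds) copied from the quoted `time` output.
# The labels are informal names chosen for this sketch.
TOTALS = {
    "baseline, non-parallel": 11.832,
    "baseline, -j40": 5.023,
    "patches, -j40, old libbpf": 3.954,
    "patches, -j40, new libbpf": 3.328,
}


def speedups(totals, reference="baseline, non-parallel"):
    """Return each configuration's speedup relative to the reference run."""
    ref = totals[reference]
    return {name: ref / secs for name, secs in totals.items()}


if __name__ == "__main__":
    for name, factor in speedups(TOTALS).items():
        print(f"{name:28s} {factor:.2f}x")
```

By this measure the final configuration is roughly 3.6x faster than the non-parallel baseline (which the mail rounds to 4x) and roughly 1.5x faster than the existing -j40 mode.
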
From the limited review I've done so far it looks good and takes advantage of the leg work I did. I'll look at your review comments and repost tomorrow morning. I won't change the patches, just split them a bit more.

- Arnaldo

>
>But parallel mode is not currently enabled in kernel build, let's
>enable parallel mode and save those seconds during the kernel build!
>
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=d81283d27266
>>
>> Kui-Feng Lee (2):
>>   dwarf_loader: Prepare and pass per-thread data to worker threads.
>>   pahole: Use per-thread btf instances to avoid mutex locking.
>>
>>  btf_encoder.c  |   5 +++
>>  btf_encoder.h  |   2 +
>>  btf_loader.c   |   2 +-
>>  ctf_loader.c   |   2 +-
>>  dwarf_loader.c |  58 ++++++++++++++++++------
>>  dwarves.h      |   9 +++-
>>  pahole.c       | 120 ++++++++++++++++++++++++++++++++++++++++++++++---
>>  pdwtags.c      |   3 +-
>>  pfunct.c       |   4 +-
>>  9 files changed, 180 insertions(+), 25 deletions(-)
>>
>> --
>> 2.30.2
>>
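
The idea in the cover letter (each worker fills its own instance without a lock, then the main thread merges everything into the primary instance) can be sketched in a few lines. This is a Python analogy under loud assumptions, not pahole or libbpf code: plain lists stand in for btf instances, `encode_unit`, `worker`, and `encode_parallel` are hypothetical names, and the work is split statically rather than through pahole's steal function.

```python
import threading


def encode_unit(unit):
    """Stand-in for encoding one compile unit into type records."""
    return [f"{unit}:type{i}" for i in range(3)]


def worker(units, local):
    # Each worker appends only to its own list, so no lock is needed.
    for unit in units:
        local.extend(encode_unit(unit))


def encode_parallel(units, nr_threads):
    # One private result list per worker (the "per-thread instance").
    per_thread = [[] for _ in range(nr_threads)]
    threads = []
    for i in range(nr_threads):
        # Static round-robin split of the work, chosen for simplicity.
        share = units[i::nr_threads]
        t = threading.Thread(target=worker, args=(share, per_thread[i]))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    # Only the main thread touches the primary list, after all joins.
    primary = []
    for local in per_thread:
        primary.extend(local)
    return primary
```

The final loop corresponds to the merge step, which is where the copy cost mentioned in the cover letter would show up.
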