On Wed, Nov 16, 2022 at 06:10:23PM -0800, Alexei Starovoitov wrote:
> On Wed, Nov 16, 2022 at 6:04 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote:
> >
> > On Wed, Nov 16, 2022 at 05:06:19PM -0800, Song Liu wrote:
> > > Use execmem_alloc, execmem_free, and execmem_fill instead of
> > > bpf_prog_pack_alloc, bpf_prog_pack_free, and bpf_arch_text_copy.
> > >
> > > execmem_free doesn't require extra size information. Therefore, the free
> > > and error handling path can be simplified.
> > >
> > > There are some tests that show the benefit of execmem_alloc.
> > >
> > > Run 100 instances of the following benchmark from bpf selftests:
> > >   tools/testing/selftests/bpf/bench -w2 -d100 -a trig-kprobe
> > > which loads 7 BPF programs, and triggers one of them.
> > >
> > > Then use perf to monitor TLB related counters:
> > >   perf stat -e iTLB-load-misses,itlb_misses.walk_completed_4k, \
> > >           itlb_misses.walk_completed_2m_4m -a
> > >
> > > The following results are from a qemu VM with 32 cores.
> > >
> > > Before bpf_prog_pack:
> > >   iTLB-load-misses: 350k/s
> > >   itlb_misses.walk_completed_4k: 90k/s
> > >   itlb_misses.walk_completed_2m_4m: 0.1/s
> > >
> > > With bpf_prog_pack (current upstream):
> > >   iTLB-load-misses: 220k/s
> > >   itlb_misses.walk_completed_4k: 68k/s
> > >   itlb_misses.walk_completed_2m_4m: 0.2/s
> > >
> > > With execmem_alloc (with this set):
> > >   iTLB-load-misses: 185k/s
> > >   itlb_misses.walk_completed_4k: 58k/s
> > >   itlb_misses.walk_completed_2m_4m: 1/s
> >
> > Wonderful.
> >
> > It would be nice to have this integrated into the bpf selftest,
>
> No. Luis please stop suggesting things that don't make sense.
> selftest/bpf are not doing performance benchmarking.
> We have the 'bench' tool for that.
> That's what Song used and it's only running standalone
> and not part of any CI.

I'm not suggesting to instantiate the VM or crap like that, I'm just
asking for the simple script to run 100 instances. This allows folks to
reproduce results in an easy way.

Whether or not you want that for selftests/bpf -- fine; a simple
in-commit script can easily represent a loop in bash if that's all that
was done.

  Luis
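
For illustration only, the loop being asked for might look roughly like the sketch below. It is not taken from the patch set: `run_instances` is a hypothetical helper name, and the only assumption is that the benchmark is an ordinary command that can be started in the background. It starts N copies, waits for all of them, and prints how many exited non-zero.

```sh
# Hypothetical helper, not part of the patch set.
# Usage: run_instances N command [args...]
# Note: plain POSIX sh, so the variables below are global rather than local.
run_instances() {
    n="$1"; shift
    pids=""
    failed=0
    i=0
    # Start N background copies of the given command.
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null 2>&1 &
        pids="$pids $!"
        i=$((i + 1))
    done
    # Wait for each copy and count the ones that exit non-zero.
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed"
}
```

To mirror the test described in the commit message, this could be called as `run_instances 100 tools/testing/selftests/bpf/bench -w2 -d100 -a trig-kprobe` while the `perf stat` command quoted above runs in another shell.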