On Wed, Nov 17, 2021 at 5:07 PM Joanne Koong <joannekoong@xxxxxx> wrote:
>
> Add benchmark to measure the overhead of the bpf_for_each call
> for a specified number of iterations.
>
> Testing this on qemu on my dev machine on 1 thread, the data is
> as follows:
>
> nr_iterations: 1
> bpf_for_each helper - total callbacks called: 42.949 ± 1.404M/s
>
> nr_iterations: 10
> bpf_for_each helper - total callbacks called: 73.645 ± 2.077M/s
>
> nr_iterations: 100
> bpf_for_each helper - total callbacks called: 73.058 ± 1.256M/s
>
> nr_iterations: 500
> bpf_for_each helper - total callbacks called: 78.255 ± 2.845M/s
>
> nr_iterations: 1000
> bpf_for_each helper - total callbacks called: 79.439 ± 1.805M/s
>
> nr_iterations: 5000
> bpf_for_each helper - total callbacks called: 81.639 ± 2.053M/s
>
> nr_iterations: 10000
> bpf_for_each helper - total callbacks called: 80.577 ± 1.824M/s
>
> nr_iterations: 50000
> bpf_for_each helper - total callbacks called: 76.773 ± 1.578M/s
>
> nr_iterations: 100000
> bpf_for_each helper - total callbacks called: 77.073 ± 2.200M/s
>
> nr_iterations: 500000
> bpf_for_each helper - total callbacks called: 75.136 ± 0.552M/s
>
> nr_iterations: 1000000
> bpf_for_each helper - total callbacks called: 76.364 ± 1.690M/s

It's a bit unclear why the numbers go down with increased nr_iterations;
I'd expect them to stabilize. Try running bench with the -a argument to
set CPU affinity; that usually improves the stability of test results.

>
> From this data, we can see that we are able to run the loop at
> least 40 million times per second on an empty callback function.
>
> From this data, we can also see that as the number of iterations
> increases, the overhead per iteration decreases and steadies towards
> a constant value.
>
> Signed-off-by: Joanne Koong <joannekoong@xxxxxx>
> ---
>  tools/testing/selftests/bpf/Makefile               |   3 +-
>  tools/testing/selftests/bpf/bench.c                |   4 +
>  .../selftests/bpf/benchs/bench_for_each.c          | 105 ++++++++++++++++++
>  .../bpf/benchs/run_bench_for_each.sh               |  16 +++
>  .../selftests/bpf/progs/for_each_helper.c          |  13 +++

$ ls progs/*bench*
progs/bloom_filter_bench.c  progs/perfbuf_bench.c
progs/ringbuf_bench.c       progs/trigger_bench.c

Let's keep the naming pattern.

>  5 files changed, 140 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/bpf/benchs/bench_for_each.c
>  create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_for_each.sh
>

[...]
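
As a side note on the throughput numbers quoted above: the claim about per-iteration overhead is easier to judge when the rates are turned into time per callback. A minimal sketch of that conversion follows; the helper name ns_per_callback is made up for illustration and is not part of the patch or of bench.c.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of the patch: convert a rate reported
 * in millions of callbacks per second (the "M/s" figures above) into
 * nanoseconds spent per callback.
 *
 *   1 second = 1e9 ns, rate = mcalls_per_sec * 1e6 callbacks/s
 *   => ns per callback = 1e9 / (mcalls_per_sec * 1e6)
 *                      = 1000 / mcalls_per_sec
 */
static double ns_per_callback(double mcalls_per_sec)
{
	return 1000.0 / mcalls_per_sec;
}

/* Print one row of the table: iteration count, rate, derived cost. */
static void print_row(unsigned long nr_iterations, double mcalls_per_sec)
{
	printf("nr_iterations=%lu: %.3fM/s, %.2f ns/callback\n",
	       nr_iterations, mcalls_per_sec,
	       ns_per_callback(mcalls_per_sec));
}
```

Calling print_row(1, 42.949) and print_row(1000, 79.439) with the quoted figures gives roughly 23.3 ns and 12.6 ns per callback respectively, which is the fixed helper-call cost being amortized as nr_iterations grows.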