Re: [PATCH bpf-next 3/3] selftest/bpf/benchs: add bpf_for_each benchmark

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Thu, 18 Nov 2021 11:55:50 -0800

On Thu, Nov 18, 2021 at 3:18 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>
> Joanne Koong <joannekoong@xxxxxx> writes:
>
> > Add benchmark to measure the overhead of the bpf_for_each call
> > for a specified number of iterations.
> >
> > Testing this on qemu on my dev machine on 1 thread, the data is
> > as follows:
>
> Absolute numbers from some random dev machine are not terribly useful;
> others have no way of replicating your tests. A more meaningful
> benchmark would need a baseline to compare to; in this case I guess that
> would be a regular loop? Do you have any numbers comparing the callback
> to just looping?

Measuring an empty for (int i = 0; i < N; i++) loop is meaningless: you
should expect on the order of billions of "operations" per second on
modern server CPUs, so it tells you nothing. These numbers are useful
as a ballpark for the overhead of the bpf_for_each() helper and its
callbacks. And 12ns per "iteration" is enough to give a good idea of
how slow that can be. Depending on your hardware it can differ by 2x,
maybe 3x, but not 100x.

Measuring inc + cmp + jne as a baseline is both unrealistic and
doesn't add much information. If you want a number for it, you can
assume roughly 2B iterations/s, give or take.
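
To put those two figures side by side, here is a rough sketch of the
arithmetic in C. It takes the 12ns per iteration and the ~2B/s guess at
face value; the helper names are made up for illustration and are not
part of the benchmark, libbpf, or the kernel.

```c
#include <assert.h>

/* Hypothetical helpers for back-of-the-envelope math only; they are
 * not part of the bench tool or of any BPF API. */

/* Convert a per-iteration cost in nanoseconds to iterations per second. */
static inline double iters_per_sec(double ns)
{
	assert(ns > 0.0);
	return 1e9 / ns;
}

/* Convert a rate in iterations per second to nanoseconds per iteration. */
static inline double ns_per_iter(double rate)
{
	assert(rate > 0.0);
	return 1e9 / rate;
}

/* How many times slower a measured per-iteration cost (in ns) is than
 * an assumed baseline rate (in iterations per second). */
static inline double slowdown_vs_baseline(double measured_ns,
					  double baseline_rate)
{
	return measured_ns / ns_per_iter(baseline_rate);
}
```

Plugging in 12.0 and 2e9: iters_per_sec(12.0) is about 83M callback
iterations per second, ns_per_iter(2e9) is 0.5ns for the bare loop, and
slowdown_vs_baseline(12.0, 2e9) is 24. So the callback costs a couple
dozen times a bare loop iteration, which any real loop body would
quickly dwarf.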

You can also run this benchmark yourself on your own hardware to get
"real" numbers, as far as you can expect real numbers from an
artificial microbenchmark, of course.

I read those numbers as "plenty fast" :)

>
> -Toke
>