Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> writes:

> On Thu, Nov 18, 2021 at 3:18 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>>
>> Joanne Koong <joannekoong@xxxxxx> writes:
>>
>> > Add benchmark to measure the overhead of the bpf_for_each call
>> > for a specified number of iterations.
>> >
>> > Testing this on qemu on my dev machine on 1 thread, the data is
>> > as follows:
>>
>> Absolute numbers from some random dev machine are not terribly useful;
>> others have no way of replicating your tests. A more meaningful
>> benchmark would need a baseline to compare to; in this case I guess that
>> would be a regular loop? Do you have any numbers comparing the callback
>> to just looping?
>
> Measuring empty for (int i = 0; i < N; i++) is meaningless, you should
> expect a number in billions of "operations" per second on modern
> server CPUs. So that will give you no idea. Those numbers are useful
> as a ballpark number of what's the overhead of bpf_for_each() helper
> and callbacks. And 12ns per "iteration" is meaningful to have a good
> idea of how slow that can be. Depending on your hardware it can be
> different by 2x, maybe 3x, but not 100x.
>
> But measuring inc + cmp + jne as a baseline is both unrealistic and
> doesn't give much more extra information. But you can assume 2B/s,
> give or take.
>
> And you also can run this benchmark on your own on your hardware to
> get "real" numbers, as much as you can expect real numbers from
> artificial microbenchmark, of course.
>
> I read those numbers as "plenty fast" :)

Hmm, okay, fair enough, but I think it would be good to have the
"~12 ns per iteration" figure featured prominently in the commit
message, then :)

-Toke