Re: [PATCH bpf-next v4 4/5] bpf/benchs: Add benchmark tests for bloom filter throughput + false positive

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Fri, 8 Oct 2021 19:54:05 -0700

On Wed, Oct 6, 2021 at 3:37 PM Joanne Koong <joannekoong@xxxxxx> wrote:
>
> On 10/6/21 3:21 PM, Joanne Koong wrote:
>
> > This patch adds benchmark tests for the throughput (for lookups + updates)
> > and the false positive rate of bloom filter lookups, as well as some
> > minor refactoring of the bash script for running the benchmarks.
> >
> > These benchmarks show that as the number of hash functions increases,
> > the throughput and the false positive rate of the bloom filter decreases.
> >  From the benchmark data, the approximate average false-positive rates for
> > 8-byte values are roughly as follows:
> >
> > 1 hash function = ~30%
> > 2 hash functions = ~15%
> > 3 hash functions = ~5%
> > 4 hash functions = ~2.5%
> > 5 hash functions = ~1%
> > 6 hash functions = ~0.5%
> > 7 hash functions  = ~0.35%
> > 8 hash functions = ~0.15%
> > 9 hash functions = ~0.1%
> > 10 hash functions = ~0%
> >
> > Signed-off-by: Joanne Koong <joannekoong@xxxxxx>
> > ---
> >   tools/testing/selftests/bpf/Makefile          |   6 +-
> >   tools/testing/selftests/bpf/bench.c           |  37 ++
> >   tools/testing/selftests/bpf/bench.h           |   3 +
> >   .../bpf/benchs/bench_bloom_filter_map.c       | 411 ++++++++++++++++++
> >   .../bpf/benchs/run_bench_bloom_filter_map.sh  |  28 ++
> >   .../bpf/benchs/run_bench_ringbufs.sh          |  30 +-
> >   .../selftests/bpf/benchs/run_common.sh        |  48 ++
> >   tools/testing/selftests/bpf/bpf_util.h        |  11 +
> >   .../selftests/bpf/progs/bloom_filter_bench.c  | 146 +++++++
> >   9 files changed, 690 insertions(+), 30 deletions(-)
> >   create mode 100644 tools/testing/selftests/bpf/benchs/bench_bloom_filter_map.c
> >   create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_bloom_filter_map.sh
> >   create mode 100644 tools/testing/selftests/bpf/benchs/run_common.sh
> >   create mode 100644 tools/testing/selftests/bpf/progs/bloom_filter_bench.c
> >

[...] (it's a good idea to trim irrelevant parts of email to make it
easier to see relevant parts)

> > +
> > +SEC("fentry/__x64_sys_getpgid")
> > +int prog_bloom_filter_hashmap_lookup(void *ctx)
> > +{
> > +     __u64 *result;
> > +     int i, j, err;
> > +
> > +     __u32 val[64] = {0};
> > +
> > +     for (i = 0; i < 1024; i++) {
> > +             for (j = 0; j < value_sz_nr_u32s && j < 64; j++)
> > +                     val[j] = bpf_get_prandom_u32();
> > +
> I tried out prepopulating these random values from the userspace side
> (where we generate a random sequence of 10000 bytes and put that
> in a bpf array map, then iterate through the bpf array map in the bpf
> program; when I tried implementing it using a global array, I saw
> verifier errors when indexing into the array).
>
> Additionally, this slows down the bench program as well, since we need
> to generate all of these random values in the setup() portion of the
> program.
> I'm not convinced that prepopulating the values ahead of time nets us
> anything - if the concern is that this slows down the bpf program,
> I think this slows down the program in both the hashmap with and without
> bloom filter cases; since we are mainly only interested in the delta between
> these two scenarios, I don't think this ends up mattering that much.

So imagine that a hashmap benchmark takes 10ms per iteration, and
bloom filter + hashmap takes 5ms. That's a 2x difference, right? Now
imagine that random values generation takes another 5ms, so actually
you measure 15ms vs 10ms run time. Now, suddenly, you have measured
just a 1.5x difference.

But it's ok, feel free to just keep the benchmark as is.

>
> > +             if (hashmap_use_bloom_filter) {
> > +                     err = bpf_map_peek_elem(&bloom_filter_map, val);
> > +                     if (err) {
> > +                             if (err != -ENOENT) {
> > +                                     error |= 3;
> > +                                     return 0;
> > +                             }
> > +                             log_result(hit_key);
> > +                             continue;
> > +                     }
> > +             }
> > +
> > +             result = bpf_map_lookup_elem(&hashmap, val);
> > +             if (result) {
> > +                     log_result(hit_key);
> > +             } else {
> > +                     if (hashmap_use_bloom_filter)
> > +                             log_result(false_hit_key);
> > +                     log_result(drop_key);
> > +             }
> > +     }
> > +
> > +     return 0;
> > +}