On Tue, Nov 30, 2021 at 6:07 AM Hou Tao <houtao1@xxxxxxxxxx> wrote:
>
> Add benchmark to compare the performance between home-made strncmp()
> in bpf program and bpf_strncmp() helper. In summary, the performance
> win of bpf_strncmp() under x86-64 is greater than 18% when the compared
> string length is greater than 64, and is 179% when the length is 4095.
> Under arm64 the performance win is even bigger: 33% when the length
> is greater than 64 and 600% when the length is 4095.
>
> The following is the details:
>
> no-helper-X: use home-made strncmp() to compare X-sized string
> helper-Y: use bpf_strncmp() to compare Y-sized string
>
> Under x86-64:
>
> no-helper-1          3.504 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-1             3.347 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-8          3.357 ± 0.001M/s (drops 0.000 ± 0.000M/s)
> helper-8             3.307 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-32         3.064 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-32            3.253 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-64         2.563 ± 0.001M/s (drops 0.000 ± 0.000M/s)
> helper-64            3.040 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-128        1.975 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-128           2.641 ± 0.000M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-512        0.759 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-512           1.574 ± 0.000M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-2048       0.329 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-2048          0.602 ± 0.000M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-4095       0.117 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-4095          0.327 ± 0.000M/s (drops 0.000 ± 0.000M/s)
>
> Under arm64:
>
> no-helper-1          2.806 ± 0.004M/s (drops 0.000 ± 0.000M/s)
> helper-1             2.819 ± 0.002M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-8          2.797 ± 0.109M/s (drops 0.000 ± 0.000M/s)
> helper-8             2.786 ± 0.025M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-32         2.399 ± 0.011M/s (drops 0.000 ± 0.000M/s)
> helper-32            2.703 ± 0.002M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-64         2.020 ± 0.015M/s (drops 0.000 ± 0.000M/s)
> helper-64            2.702 ± 0.073M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-128        1.604 ± 0.001M/s (drops 0.000 ± 0.000M/s)
> helper-128           2.516 ± 0.002M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-512        0.699 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-512           2.106 ± 0.003M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-2048       0.215 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-2048          1.223 ± 0.003M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-4095       0.112 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-4095          0.796 ± 0.000M/s (drops 0.000 ± 0.000M/s)
>
> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx>
> ---
>  tools/testing/selftests/bpf/Makefile          |   4 +-
>  tools/testing/selftests/bpf/bench.c           |   6 +
>  .../selftests/bpf/benchs/bench_strncmp.c      | 150 ++++++++++++++++++
>  .../selftests/bpf/benchs/run_bench_strncmp.sh |  12 ++
>  .../selftests/bpf/progs/strncmp_bench.c       |  50 ++++++
>  5 files changed, 221 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/bpf/benchs/bench_strncmp.c
>  create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_strncmp.sh
>  create mode 100644 tools/testing/selftests/bpf/progs/strncmp_bench.c
>

[...]

> diff --git a/tools/testing/selftests/bpf/progs/strncmp_bench.c b/tools/testing/selftests/bpf/progs/strncmp_bench.c
> new file mode 100644
> index 000000000000..18373a7df76e
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/strncmp_bench.c
> @@ -0,0 +1,50 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (C) 2021. Huawei Technologies Co., Ltd */
> +#include <linux/types.h>
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +
> +#define STRNCMP_STR_SZ 4096
> +
> +/* Will be updated by benchmark before program loading */
> +const volatile unsigned int cmp_str_len = 1;
> +const char target[STRNCMP_STR_SZ];
> +
> +long hits = 0;
> +char str[STRNCMP_STR_SZ];
> +
> +char _license[] SEC("license") = "GPL";
> +
> +static __always_inline int local_strncmp(const char *s1, unsigned int sz,
> +					 const char *s2)
> +{
> +	int ret = 0;
> +	unsigned int i;
> +
> +	for (i = 0; i < sz; i++) {
> +		/* E.g. 0xff > 0x31 */
> +		ret = (unsigned char)s1[i] - (unsigned char)s2[i];

I'm actually not sure if it will perform subtraction in unsigned form
(and thus you'll never have a negative result) and then cast to int,
or not. Why not cast to int instead of unsigned char to be sure?

> +		if (ret || !s1[i])
> +			break;
> +	}
> +
> +	return ret;
> +}
> +
> +SEC("tp/syscalls/sys_enter_getpgid")
> +int strncmp_no_helper(void *ctx)
> +{
> +	if (local_strncmp(str, cmp_str_len + 1, target) < 0)
> +		__sync_add_and_fetch(&hits, 1);
> +	return 0;
> +}
> +
> +SEC("tp/syscalls/sys_enter_getpgid")
> +int strncmp_helper(void *ctx)
> +{
> +	if (bpf_strncmp(str, cmp_str_len + 1, target) < 0)
> +		__sync_add_and_fetch(&hits, 1);
> +	return 0;
> +}
> +
> --
> 2.29.2
>
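
On the question raised about the subtraction in local_strncmp(): as far
as I understand the C integer promotion rules, both unsigned char
operands are promoted to int before the subtraction whenever int is
wider than char, so the result can be negative. The userspace sketch
below is only an illustration, not part of the patch. The helper names
diff_as_unsigned_char(), diff_as_signed_char() and check_promotion()
are made up for this example. The second helper models what a plain
cast to int would give on a platform where char is signed, which seems
to be the case the "0xff > 0x31" comment guards against.

```c
#include <assert.h>

/* Same expression as in local_strncmp(): each operand is converted to
 * unsigned char, then promoted to int before the subtraction. */
static int diff_as_unsigned_char(char a, char b)
{
	return (unsigned char)a - (unsigned char)b;
}

/* Hypothetical variant: cast straight to int. signed char is used
 * explicitly to model platforms where plain char is signed. */
static int diff_as_signed_char(signed char a, signed char b)
{
	return (int)a - (int)b;
}

/* Hypothetical self-check collecting the cases discussed above. */
static void check_promotion(void)
{
	/* 'a' - 'b' is computed in int, so a negative result is possible. */
	assert(diff_as_unsigned_char('a', 'b') == -1);
	/* 0xff compares greater than 0x31 ('1'): 255 - 49. */
	assert(diff_as_unsigned_char('\xff', '1') == 206);
	/* With a signed char, the byte 0xff is -1 and sorts below '1'. */
	assert(diff_as_signed_char(-1, '1') == -50);
}
```

If that reading is right, the (unsigned char) casts already yield a
signed difference, and replacing them with (int) would change the
ordering of bytes >= 0x80 wherever char is signed.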