Hi,

The motivation for introducing the bpf_strncmp() helper comes from
two aspects:

(1) clang doesn't always replace strncmp() automatically
In tracing programs, we sometimes need to use a home-made strncmp()
to check whether or not the file name is the expected one.

(2) the performance of a home-made strncmp() is not so good
As shown by the benchmark in patch #4, the bpf_strncmp() helper is
18% (x86-64) or 33% (arm64) faster than a home-made strncmp() when the
compared string length is 64. When the string length grows to 4095,
the performance win is 179% (x86-64) or 600% (arm64).

The prototype of bpf_strncmp() has changed from

  bpf_strncmp(const char *s1, const char *s2, u32 s2_sz)

to

  bpf_strncmp(const char *s1, u32 s1_sz, const char *s2)

The main reason is readability, and there is nearly no performance
difference between these two APIs (refer to the data attached
below [1]).

Any comments are welcome.

Regards,
Tao

Change Log:
v1:
 * change API to bpf_strncmp(const char *s1, u32 s1_sz, const char *s2)
 * add benchmark refactor and benchmark between bpf_strncmp() and
   strncmp()

RFC: https://lore.kernel.org/bpf/20211106132822.1396621-1-houtao1@xxxxxxxxxx/

[1] Performance difference between two APIs under x86-64:

helper_rfc-X: use bpf_strncmp in RFC to compare X-sized string
helper-Y: use bpf_strncmp in v1 to compare Y-sized string

helper_rfc-1     3.482 ± 0.002M/s (drops 0.000 ± 0.000M/s)
helper-1         3.485 ± 0.001M/s (drops 0.000 ± 0.000M/s)
helper_rfc-8     3.428 ± 0.001M/s (drops 0.000 ± 0.000M/s)
helper-8         3.434 ± 0.001M/s (drops 0.000 ± 0.000M/s)
helper_rfc-32    3.253 ± 0.002M/s (drops 0.000 ± 0.000M/s)
helper-32        3.234 ± 0.001M/s (drops 0.000 ± 0.000M/s)
helper_rfc-64    3.039 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper-64        3.042 ± 0.001M/s (drops 0.000 ± 0.000M/s)
helper_rfc-128   2.640 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper-128       2.633 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper_rfc-512   1.576 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper-512       1.574 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper_rfc-2048  0.602 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper-2048      0.602 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper_rfc-4095  0.328 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper-4095      0.328 ± 0.000M/s (drops 0.000 ± 0.000M/s)

Hou Tao (5):
  bpf: add bpf_strncmp helper
  selftests/bpf: fix checkpatch error on empty function parameter
  selftests/bpf: factor out common helpers for benchmarks
  selftests/bpf: add benchmark for bpf_strncmp() helper
  selftests/bpf: add test cases for bpf_strncmp()

 include/linux/bpf.h                           |   1 +
 include/uapi/linux/bpf.h                      |  11 ++
 kernel/bpf/helpers.c                          |  16 ++
 tools/include/uapi/linux/bpf.h                |  11 ++
 tools/testing/selftests/bpf/Makefile          |   4 +-
 tools/testing/selftests/bpf/bench.c           |  21 ++-
 tools/testing/selftests/bpf/bench.h           |  34 +++-
 .../bpf/benchs/bench_bloom_filter_map.c       |  44 ++---
 .../selftests/bpf/benchs/bench_count.c        |  16 +-
 .../selftests/bpf/benchs/bench_rename.c       |  43 ++---
 .../selftests/bpf/benchs/bench_ringbufs.c     |  21 +--
 .../selftests/bpf/benchs/bench_strncmp.c      | 150 ++++++++++++++++
 .../selftests/bpf/benchs/bench_trigger.c      |  79 ++++----
 .../selftests/bpf/benchs/run_bench_strncmp.sh |  12 ++
 .../selftests/bpf/prog_tests/test_strncmp.c   | 170 ++++++++++++++++++
 .../selftests/bpf/progs/strncmp_bench.c       |  50 ++++++
 .../selftests/bpf/progs/strncmp_test.c        |  59 ++++++
 17 files changed, 604 insertions(+), 138 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/benchs/bench_strncmp.c
 create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_strncmp.sh
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_strncmp.c
 create mode 100644 tools/testing/selftests/bpf/progs/strncmp_bench.c
 create mode 100644 tools/testing/selftests/bpf/progs/strncmp_test.c

-- 
2.29.2
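
For readers who want to see what the "home-made strncmp()" in the motivation might look like next to the v1 argument order, here is a minimal userspace sketch. It is not code from this series: homemade_strncmp(), model_bpf_strncmp(), sign() and self_check() are hypothetical names, and the model assumes the helper compares at most s1_sz bytes with the same result as strncmp(s1, s2, s1_sz).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical home-made comparison: a byte-by-byte loop of the kind a
 * BPF program has to open-code when clang does not replace strncmp().
 */
static int homemade_strncmp(const char *s1, const char *s2, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned char c1 = (unsigned char)s1[i];
		unsigned char c2 = (unsigned char)s2[i];

		if (c1 != c2)
			return c1 < c2 ? -1 : 1;
		if (c1 == '\0')
			break;
	}
	return 0;
}

/*
 * Userspace stand-in for the v1 argument order (s1, s1_sz, s2).
 * Assumption: the result matches strncmp(s1, s2, s1_sz).
 */
static int model_bpf_strncmp(const char *s1, unsigned int s1_sz,
			     const char *s2)
{
	return strncmp(s1, s2, s1_sz);
}

/* Reduce a comparison result to -1, 0 or 1 so results can be compared. */
static int sign(int v)
{
	return (v > 0) - (v < 0);
}

/* Check that both variants agree on a few example inputs. */
static void self_check(void)
{
	char comm[16] = "test_progs";

	assert(sign(homemade_strncmp(comm, "test_progs", sizeof(comm))) ==
	       sign(model_bpf_strncmp(comm, sizeof(comm), "test_progs")));
	assert(sign(homemade_strncmp(comm, "test_maps", sizeof(comm))) ==
	       sign(model_bpf_strncmp(comm, sizeof(comm), "test_maps")));
	assert(sign(homemade_strncmp(comm, "test", 4)) ==
	       sign(model_bpf_strncmp(comm, 4, "test")));
}
```

Calling self_check() from a main() exercises both variants. Keeping the size right after s1 in the sketch mirrors the readability argument given for the v1 prototype: the length sits next to the buffer it bounds.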