From: Hou Tao <houtao1@xxxxxxxxxx>

Hi,

The motivation for this patch set comes from performance profiling of a
bpf memory allocator benchmark (which will be posted soon). The initial
purpose of the benchmark is to test whether there is any performance
degradation when using c->unit_size instead of ksize() to select the
target cache for free [1]. The benchmark uses bpf_kptr_xchg() to stash
the allocated objects and then fetches the stashed objects to free them.
Based on the fix proposed in [1], after inlining bpf_kptr_xchg() the
performance of object free increases by about 4%.

Initially the inlining was implemented directly in do_jit() for x86-64,
but I think it is more portable to implement it in the verifier.

Please see the individual patches for more details. Comments are always
welcome.

[1]: https://lore.kernel.org/bpf/20231216131052.27621-1-houtao@xxxxxxxxxxxxxxx

Hou Tao (3):
  bpf: Support inlining bpf_kptr_xchg() helper
  bpf, x86: Don't generate lock prefix for BPF_XCHG
  bpf, x86: Inline bpf_kptr_xchg() on x86-64

 arch/x86/net/bpf_jit_comp.c |  9 ++++++++-
 include/linux/filter.h      |  1 +
 kernel/bpf/core.c           | 10 ++++++++++
 kernel/bpf/verifier.c       | 17 +++++++++++++++++
 4 files changed, 36 insertions(+), 1 deletion(-)

-- 
2.29.2