On 1/6/2024 6:53 AM, Song Liu wrote:
> On Fri, Jan 5, 2024 at 2:47 AM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>> From: Hou Tao <houtao1@xxxxxxxxxx>
>>
>> Hi,
>>
>> The motivation of inlining bpf_kptr_xchg() comes from the performance
>> profiling of bpf memory allocator benchmark [1]. The benchmark uses
>> bpf_kptr_xchg() to stash the allocated objects and to pop the stashed
>> objects for free. After inlining bpf_kptr_xchg(), the performance for
>> object free on an 8-CPU VM increases by about 2%~10%. However the
>> performance gain comes with costs: both the kasan and kcsan checks on
>> the pointer will be unavailable. Initially the inline is implemented in
>> do_jit() for x86-64 directly, but I think it will be more portable to
>> implement the inline in verifier.
> How much work would it take to enable this on other major architectures?
> AFAICT, most jit compilers already handle BPF_XCHG, so it should be
> relatively simple?

Yes. I think enabling this inline will be relatively simple. As said in
patch #1, the inline depends on two conditions:

1) atomic_xchg() is supported on pointer-sized words.
2) the implementation of xchg is the same as atomic_xchg() on
   pointer-sized words.

For condition 1), I think the JIT backends of most major architectures
already support it. So the remaining work is to check the implementations
of xchg and atomic_xchg(), to enable the inline, and to do more testing.

I will try to enable the inline on arm64 first. Will x86-64 + arm64 be
enough for the definition of "major architectures"? Or should it include
riscv, s390 and powerpc as well?

> Other than this, for the set
>
> Acked-by: Song Liu <song@xxxxxxxxxx>

Thanks for the ack.

>
> Thanks,
> Song
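P.S. For readers following the thread, here is a rough user-space sketch of
the stash/pop pattern that the benchmark exercises, reduced to a single
atomic exchange on a pointer-sized word. It is not the kernel code or the
patch itself: it uses C11 <stdatomic.h> in place of the kernel's xchg(), and
struct stash_slot, slot_xchg(), slot_stash() and slot_pop() are hypothetical
names made up for illustration. The static_assert is only a stand-in for
condition 1).

```c
#include <assert.h>     /* static_assert */
#include <stdatomic.h>  /* _Atomic, atomic_exchange, ATOMIC_POINTER_LOCK_FREE */
#include <stddef.h>     /* NULL */

/* Illustration of condition 1): atomic exchange on a pointer-sized word
 * must be natively (lock-free) supported on this target. */
static_assert(ATOMIC_POINTER_LOCK_FREE == 2,
              "pointer-sized atomic exchange must be lock-free");

/* Hypothetical stand-in for a map value that holds one stashed pointer. */
struct stash_slot {
    _Atomic(void *) ptr;
};

/* Store new_ptr in the slot and return the pointer that was there before.
 * This mirrors the "swap and return old value" contract of
 * bpf_kptr_xchg(), expressed as one atomic exchange. */
static void *slot_xchg(struct stash_slot *slot, void *new_ptr)
{
    return atomic_exchange(&slot->ptr, new_ptr);
}

/* Stash an object; the caller owns whatever pointer comes back. */
static void *slot_stash(struct stash_slot *slot, void *obj)
{
    return slot_xchg(slot, obj);
}

/* Pop the stashed object (or NULL if the slot is empty) so it can be freed. */
static void *slot_pop(struct stash_slot *slot)
{
    return slot_xchg(slot, NULL);
}
```

Stashing into an empty slot returns NULL, a following pop returns the
stashed pointer, and a second pop returns NULL again. Because the helper
reduces to one exchange instruction, replacing the call with that
instruction is what removes the call overhead, and also what bypasses the
kasan and kcsan instrumentation mentioned above.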