Re: [PATCH bpf-next v3 0/3] bpf: inline bpf_kptr_xchg()

Song Liu <song@xxxxxxxxxx> · Sat, 6 Jan 2024 00:51:04 -0800

On Fri, Jan 5, 2024 at 6:34 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>
> On 1/6/2024 6:53 AM, Song Liu wrote:
> > On Fri, Jan 5, 2024 at 2:47 AM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
> >> From: Hou Tao <houtao1@xxxxxxxxxx>
> >>
> >> Hi,
> >>
> >> The motivation of inlining bpf_kptr_xchg() comes from the performance
> >> profiling of bpf memory allocator benchmark [1]. The benchmark uses
> >> bpf_kptr_xchg() to stash the allocated objects and to pop the stashed
> > >> objects for free. After inlining bpf_kptr_xchg(), the performance for
> > >> object free on an 8-CPU VM increases by about 2%~10%. However the performance
> >> gain comes with costs: both the kasan and kcsan checks on the pointer
> >> will be unavailable. Initially the inline is implemented in do_jit() for
> > >> x86-64 directly, but I think it will be more portable to implement the
> > >> inline in the verifier.
> > How much work would it take to enable this on other major architectures?
> > AFAICT, most jit compilers already handle BPF_XCHG, so it should be
> > relatively simple?
>
> Yes. I think enabling this inline will be relatively simple. As said in
> patch #1, the inline depends on two conditions:
> 1) atomic_xchg() support on pointer-sized word.
> 2) the implementation of xchg is the same as atomic_xchg() on
> pointer-sized words.
> For condition 1), I think most major architecture JIT backends already
> support it. So the remaining work is to check the implementations of xchg
> and atomic_xchg(), to enable the inline, and to do more testing.

Thanks for the clarification.
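
For readers following along, here is a rough user-space sketch of what the
two conditions amount to: the helper behaves like an atomic exchange on a
pointer-sized word, which is why a single BPF_XCHG instruction could stand
in for the call. This is only a model under stated assumptions, not the
kernel code; struct stash_slot and model_kptr_xchg() are hypothetical names,
and C11 atomic_exchange() is used as a stand-in for the kernel's xchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Condition 1 analogue: the exchange operates on a pointer-sized word. */
static_assert(sizeof(void *) == sizeof(uintptr_t),
	      "pointer-sized word expected");

/* Hypothetical stand-in for a map value that holds one kptr field. */
struct stash_slot {
	_Atomic uintptr_t kptr;
};

/*
 * User-space model of the helper's semantics (an assumption, not the
 * kernel implementation): atomically store new_ptr into the slot and
 * return whatever pointer was stored there before.
 */
static void *model_kptr_xchg(struct stash_slot *slot, void *new_ptr)
{
	return (void *)atomic_exchange(&slot->kptr, (uintptr_t)new_ptr);
}
```

Stashing an object is model_kptr_xchg(&slot, obj), and popping it for free
is model_kptr_xchg(&slot, NULL); in both cases the caller owns whatever
pointer comes back.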

> I will try to enable the inline on arm64 first. And will x86-64 + arm64
> be enough for the definition of "major architectures"? Or should it
> include riscv, s390, powerpc as well?

x86_64 + arm64 is "major" enough. :) Maintainers of other JIT engines
can help with other archs.

Thanks,
Song