On 2/5/24 10:18 AM, Andrii Nakryiko wrote:
On Sun, Feb 4, 2024 at 11:20 AM Yonghong Song <yonghong.song@xxxxxxxxx> wrote:
On 2/2/24 2:18 PM, Andrii Nakryiko wrote:
On Wed, Jan 31, 2024 at 7:56 AM Leon Hwang <hffilwlqm@xxxxxxxxx> wrote:
This patchset introduces a new generic kfunc bpf_ffs64(). This kfunc
allows bpf to reuse kernel's __ffs64() function to improve ffs
performance in bpf.
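
For readers unfamiliar with the operation, here is a minimal sketch of what a
"find first set" on a 64-bit word computes, assuming the kernel's __ffs64()
semantics (0-based index of the least significant set bit, result undefined
for 0). The names ffs64_generic() and ffs64_builtin() are hypothetical and are
not part of the patchset; the generic version returns 64 for a zero input only
to keep the loop well defined.

```c
#include <stdint.h>

/* Hypothetical portable version: scan bits from least significant upward.
 * Returns the 0-based index of the first set bit, or 64 if word is 0. */
static inline unsigned int ffs64_generic(uint64_t word)
{
	unsigned int idx;

	for (idx = 0; idx < 64; idx++) {
		if ((word >> idx) & 1)
			return idx;
	}
	return 64;
}

/* Hypothetical builtin-based version: __builtin_ctzll() is a GCC/Clang
 * builtin that counts trailing zero bits. Its result is undefined for 0,
 * so the caller must guarantee word != 0. */
static inline unsigned int ffs64_builtin(uint64_t word)
{
	return (unsigned int)__builtin_ctzll((unsigned long long)word);
}
```

The bounded loop costs up to 64 iterations, which is the kind of overhead a
dedicated kfunc or instruction would avoid.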
The downside of using a kfunc for this is that the compiler will assume
that R1-R5 have to be spilled/filled, because that's the function call
convention in BPF.
If this was an instruction, though, it would be much more efficient
and would avoid this problem. But I see how something like ffs64 is
useful. I think it would be good to also have popcnt instruction and a
few other fast bit manipulation operations as well.
Perhaps we should think about another BPF ISA extension to add fast
bit manipulation instructions?
Sounds like a good idea to start the conversation. Besides popcnt, lzcnt
is also a candidate. From the LLVM perspective, it would be hard to
generate ffs64/popcnt/lzcnt etc. from a generic source implementation.
I'm curious why? I assumed that if a user used __builtin_popcount()
Clang could just generate BPF's popcnt instruction (assuming the right
BPF cpu version is enabled, of course).
I was not aware of __builtin_popcount(). Yes, the BPF backend should be
able to easily convert __builtin_popcount() to a BPF insn.
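
To illustrate the difference being discussed, here is a minimal sketch of a
generic source-level popcount next to the builtin form. The function names
popcount64_generic() and popcount64_builtin() are hypothetical. Whether the
BPF backend would lower the builtin to a single instruction depends on the
ISA extension proposed in this thread, which does not exist yet; on other
targets the compiler picks a native instruction or a library fallback.

```c
#include <stdint.h>

/* Hypothetical generic version (Kernighan's method): each iteration
 * clears the lowest set bit, so the loop runs once per set bit. A
 * compiler has to recognize this whole loop to turn it into one insn. */
static inline unsigned int popcount64_generic(uint64_t word)
{
	unsigned int count = 0;

	while (word) {
		word &= word - 1;	/* clear the lowest set bit */
		count++;
	}
	return count;
}

/* Hypothetical builtin-based version: __builtin_popcountll() is a
 * GCC/Clang builtin, so the intent is explicit and the backend can map
 * it directly to whatever the target provides. */
static inline unsigned int popcount64_builtin(uint64_t word)
{
	return (unsigned int)__builtin_popcountll((unsigned long long)word);
}
```

With the builtin the backend does not have to pattern-match a loop, which is
why the builtin route looks easier than recognizing a generic implementation.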
So most likely, inline asm will be used. libbpf could define
some macros to make adoption easier. The verifier and JIT will do
the proper thing: either the JIT uses the corresponding arch insns
directly, or the verifier rewrites them so the JIT won't be aware
of these insns.
[...]