Hey Puranjay! :)

On Mon, Mar 13, 2023 at 1:56 PM Puranjay Mohan <puranjay12@xxxxxxxxx> wrote:
>
> [CC: Florent, KP]
>
> On Mon, Mar 13, 2023 at 7:50 AM Xu Kuohai <xukuohai@xxxxxxxxxx> wrote:
> >
> > [ cc arm list ]
> >
> > On 3/10/2023 5:33 PM, Puranjay Mohan wrote:
> > > Hi,
> > > I am starting this thread to know if someone is implementing the BPF
> > > dispatcher for ARM64 and if not, what would be needed to make this
> > > happen.

As Alexei said, I've been doing some work on ftrace direct calls on
arm64 (so the trampolines can get called in tracing programs)

https://lore.kernel.org/all/20230207182135.2671106-1-revest@xxxxxxxxxxxx/

It is currently blocked waiting for a review from the ftrace
maintainer. Steven has been quite busy but I regularly nag him to
review it :)

> > > The basic infra + x86 specific code was introduced in [1] by Björn Töpel.
> > >
> > > To make BPF dispatcher work on ARM64, the
> > > arch_prepare_bpf_dispatcher() has to be implemented in
> > > arch/arm64/net/bpf_jit_comp.c.
> > >
> > > As I am not well versed with XDP and the JIT, I have a few questions
> > > regarding this.
> > >
> > > 1. What is the best way to test this? Is there a selftest that will
> > > fail now and will pass once the dispatcher is implemented?
> > > 2. As there is no CONFIG_RETPOLINE in ARM64, will the dispatcher be useful.
> >
> > Hello,
> >
> > I have some thoughts for bpf dispatcher in arm64.
> >
> > bpf dispatcher uses static call to convert indirect call instructions to direct
> > call instructions, to avoid performance penalty introduced by retpoline. Since
> > there is no retpoline or static call in arm64, bpf dispatcher seems useless.

But I agree with Xu here. The reason why I did not look into bpf
dispatchers for arm64 is because there is no retpoline cost on arm64.

> > In addition, the range for a direct call instruction in arm64 is +-128MB, but
> > jited bpf image address is outside of +-128MB, so it may not be possible to call
> > a bpf prog with direct call instruction.
>
> So, to summarize all the information about BPF Dispatcher on ARM64:
> 1. The range for the B and BL instructions in arm64 is +-128MB, so we
> can't use direct jump.
> 2. Static Calls are not supported on ARM64 yet.
> 3. bpf_prog_pack allocator for ARM64 is not yet enabled because
> bpf_arch_text_copy()
> and bpf_arch_text_invalidate() are not implemented.
>
> Even if static calls are implemented the dispatcher can't be
> implemented because of point 1.

And even if they could, I don't see what value they would bring on arm64.

> What would be required to implement bpf_arch_text_copy()
> and bpf_arch_text_invalidate(). As enabling the bpf_prog_pack
> allocator for ARM64
> would be useful in the JIT as well.

I have not looked into this at all but ooc have you noticed the series
for powerpc sent just a few days back to the list?

https://lore.kernel.org/bpf/20230309180028.180200-1-hbathini@xxxxxxxxxxxxx/
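
For reference, a minimal userspace sketch (not kernel code) of where the
+-128MB figure quoted above comes from: B and BL carry a signed 26-bit
word offset, so the reachable byte offset is [-2^27, 2^27 - 4]. The
helper names bl_in_range() and encode_bl() are hypothetical, made up for
this illustration; the kernel has its own instruction-generation helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* B/BL hold a signed 26-bit offset counted in 4-byte instructions,
 * so the byte offset must lie in [-128 MiB, 128 MiB - 4]. */
#define BL_RANGE (INT64_C(1) << 27) /* 128 MiB */

/* Hypothetical helper: can a BL at 'pc' reach 'target' directly? */
static bool bl_in_range(uint64_t pc, uint64_t target)
{
	int64_t off = (int64_t)(target - pc);

	/* Offset must be instruction-aligned and fit in imm26 * 4. */
	return (off & 3) == 0 && off >= -BL_RANGE && off < BL_RANGE;
}

/* Hypothetical helper: build the 32-bit BL encoding for pc -> target. */
static uint32_t encode_bl(uint64_t pc, uint64_t target)
{
	int64_t off = (int64_t)(target - pc);

	assert(bl_in_range(pc, target));
	/* 0x94000000 is the BL opcode; the low 26 bits are the word offset. */
	return UINT32_C(0x94000000) |
	       (uint32_t)(((uint64_t)off >> 2) & UINT64_C(0x03ffffff));
}
```

If a jited image sits further away than that from the call site,
bl_in_range() returns false, which is the situation Xu describes.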