On Fri Jun 21, 2024 at 5:09 AM AEST, Naveen N Rao wrote:
> Add support for bpf_arch_text_poke() and arch_prepare_bpf_trampoline()
> for 64-bit powerpc.

What do BPF trampolines give you?

> BPF prog JIT is extended to mimic the 64-bit powerpc approach for
> ftrace, having a single nop at function entry, followed by the function
> profiling sequence out-of-line and a separate long branch stub for
> calls to trampolines that are out of range. A dummy_tramp is provided
> to simplify synchronization similar to arm64.

Synchronization - between the BPF and ftrace interfaces?

> BPF trampolines adhere to the existing ftrace ABI utilizing a
> two-instruction profiling sequence, as well as the newer ABI utilizing
> a three-instruction profiling sequence enabling return with a 'blr'.
> The trampoline code itself closely follows the x86 implementation.
>
> While the code is generic, BPF trampolines are only enabled on 64-bit
> powerpc. 32-bit powerpc will need testing and some updates.
>
> Signed-off-by: Naveen N Rao <naveen@xxxxxxxxxx>

Just a quick glance for now, and I don't know the BPF code much.

> ---
>  arch/powerpc/include/asm/ppc-opcode.h |  14 +
>  arch/powerpc/net/bpf_jit.h            |  11 +
>  arch/powerpc/net/bpf_jit_comp.c       | 702 +++++++++++++++++++++++++-
>  arch/powerpc/net/bpf_jit_comp32.c     |   7 +-
>  arch/powerpc/net/bpf_jit_comp64.c     |   7 +-
>  5 files changed, 738 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
> index 076ae60b4a55..9eaa2c5d9b73 100644
> --- a/arch/powerpc/include/asm/ppc-opcode.h
> +++ b/arch/powerpc/include/asm/ppc-opcode.h
> @@ -585,12 +585,26 @@
>  #define PPC_RAW_MTSPR(spr, d)		(0x7c0003a6 | ___PPC_RS(d) | __PPC_SPR(spr))
>  #define PPC_RAW_EIEIO()			(0x7c0006ac)
>
> +/* bcl 20,31,$+4 */
> +#define PPC_RAW_BCL()			(0x429f0005)

This is the special bcl form that gives the current address. Maybe call
it PPC_RAW_BCL4().

> +void dummy_tramp(void);
> +
> +asm (
> +"	.pushsection .text, \"ax\", @progbits	;"
> +"	.global dummy_tramp			;"
> +"	.type dummy_tramp, @function		;"
> +"dummy_tramp:					;"
> +#ifdef CONFIG_FTRACE_PFE_OUT_OF_LINE
> +"	blr					;"
> +#else
> +"	mflr	11				;"

Can you just drop this instruction? The caller will always have it in
r11?

> +"	mtctr	11				;"
> +"	mtlr	0				;"
> +"	bctr					;"
> +#endif
> +"	.size dummy_tramp, .-dummy_tramp	;"
> +"	.popsection				;"
> +);
> +
> +void bpf_jit_build_fentry_stubs(u32 *image, struct codegen_context *ctx)
> +{
> +	int ool_stub_idx, long_branch_stub_idx;
> +
> +	/*
> +	 * Out-of-line stub:
> +	 *	mflr	r0
> +	 *	[b|bl]	tramp
> +	 *	mtlr	r0	// only with CONFIG_FTRACE_PFE_OUT_OF_LINE
> +	 *	b	bpf_func + 4
> +	 */
> +	ool_stub_idx = ctx->idx;
> +	EMIT(PPC_RAW_MFLR(_R0));
> +	EMIT(PPC_RAW_NOP());
> +	if (IS_ENABLED(CONFIG_FTRACE_PFE_OUT_OF_LINE))
> +		EMIT(PPC_RAW_MTLR(_R0));
> +	WARN_ON_ONCE(!is_offset_in_branch_range(4 - (long)ctx->idx * 4)); /* TODO */
> +	EMIT(PPC_RAW_BRANCH(4 - (long)ctx->idx * 4));
> +
> +	/*
> +	 * Long branch stub:
> +	 *	.long <dummy_tramp_addr>
> +	 *	mflr	r11
> +	 *	bcl	20,31,$+4
> +	 *	mflr	r12
> +	 *	ld	r12, -8-SZL(r12)
> +	 *	mtctr	r12
> +	 *	mtlr	r11	// needed to retain ftrace ABI
> +	 *	bctr
> +	 */

You could avoid clobbering LR on >= POWER9 with the addpcis
instruction. Or use a pcrel load with pcrel even. I guess that's
something to do later.

Thanks,
Nick
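
For reference, the offset arithmetic in the quoted long branch stub
works out as follows (a sketch only; the labels are mine, not the
patch's, and SZL is sizeof(unsigned long), i.e. 8 on ppc64):

	addr:	.quad	<trampoline address>	# runtime-patched slot, SZL bytes
		mflr	r11			# preserve LR for the ftrace ABI
		bcl	20,31,$+4		# LR = address of "here"
	here:	mflr	r12			# r12 = here
		ld	r12, -8-SZL(r12)	# -8 steps back over the bcl and
						# the mflr r11; -SZL lands on "addr"
		mtctr	r12
		mtlr	r11			# restore the saved LR
		bctr				# enter the trampoline

bcl 20,31,$+4 is architected as an always-taken branch that is not
treated as a subroutine call, so it reads the current address without
disturbing the link stack predictor - hence the dedicated
PPC_RAW_BCL() encoding above.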
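
To make the addpcis/pcrel suggestion concrete, one possible shape for
a stub that never touches LR (an untested sketch, register choice
illustrative; lnia is the extended mnemonic for addpcis rt,0 and needs
ISA v3.0 / POWER9, while pld with @pcrel needs ISA v3.1 / Power10):

	# POWER9: lnia yields the address of the next instruction
	addr:	.quad	<trampoline address>
		lnia	r12			# r12 = addr + SZL + 4
		ld	r12, -4-SZL(r12)	# load the slot at "addr"
		mtctr	r12
		bctr				# LR untouched throughout

	# Power10 with pcrel: a single prefixed load replaces the
	# slot-relative addressing entirely
		pld	r12, addr@pcrel
		mtctr	r12
		bctr

With LR never clobbered there is nothing to save or restore, so the
mflr r11/mtlr r11 pair disappears along with the bcl.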