Re: [RFC PATCH v3 11/11] powerpc64/bpf: Add support for bpf trampolines

On Fri Jun 21, 2024 at 5:09 AM AEST, Naveen N Rao wrote:
> Add support for bpf_arch_text_poke() and arch_prepare_bpf_trampoline()
> for 64-bit powerpc.

What do BPF trampolines give you?

> BPF prog JIT is extended to mimic 64-bit powerpc approach for ftrace
> having a single nop at function entry, followed by the function
> profiling sequence out-of-line and a separate long branch stub for calls
> to trampolines that are out of range. A dummy_tramp is provided to
> simplify synchronization similar to arm64.

Synchronization - between the BPF and ftrace interfaces?

> BPF Trampolines adhere to the existing ftrace ABI utilizing a
> two-instruction profiling sequence, as well as the newer ABI utilizing a
> three-instruction profiling sequence enabling return with a 'blr'. The
> trampoline code itself closely follows x86 implementation.
>
> While the code is generic, BPF trampolines are only enabled on 64-bit
> powerpc. 32-bit powerpc will need testing and some updates.
>
> Signed-off-by: Naveen N Rao <naveen@xxxxxxxxxx>

Just a quick glance for now; I don't know the BPF code much.

> ---
>  arch/powerpc/include/asm/ppc-opcode.h |  14 +
>  arch/powerpc/net/bpf_jit.h            |  11 +
>  arch/powerpc/net/bpf_jit_comp.c       | 702 +++++++++++++++++++++++++-
>  arch/powerpc/net/bpf_jit_comp32.c     |   7 +-
>  arch/powerpc/net/bpf_jit_comp64.c     |   7 +-
>  5 files changed, 738 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
> index 076ae60b4a55..9eaa2c5d9b73 100644
> --- a/arch/powerpc/include/asm/ppc-opcode.h
> +++ b/arch/powerpc/include/asm/ppc-opcode.h
> @@ -585,12 +585,26 @@
>  #define PPC_RAW_MTSPR(spr, d)		(0x7c0003a6 | ___PPC_RS(d) | __PPC_SPR(spr))
>  #define PPC_RAW_EIEIO()			(0x7c0006ac)
>  
> +/* bcl 20,31,$+4 */
> +#define PPC_RAW_BCL()			(0x429f0005)

This is the special bcl form used to read the current address (it sets
LR to the address of the following instruction). Maybe call it
PPC_RAW_BCL4()?
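Something like (same encoding, just the name):

	/* bcl 20,31,$+4 */
	#define PPC_RAW_BCL4()		(0x429f0005)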

>  
> +void dummy_tramp(void);
> +
> +asm (
> +"	.pushsection .text, \"ax\", @progbits	;"
> +"	.global dummy_tramp			;"
> +"	.type dummy_tramp, @function		;"
> +"dummy_tramp:					;"
> +#ifdef CONFIG_FTRACE_PFE_OUT_OF_LINE
> +"	blr					;"
> +#else
> +"	mflr	11				;"

Can you just drop this instruction? The caller will always
have it in r11?
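If so, the non-OOL variant could shrink to something like this
(untested sketch, assuming the ftrace ABI always has LR in r11 on
entry here):

	mtctr	11
	mtlr	0
	bctr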

> +"	mtctr	11				;"
> +"	mtlr	0				;"
> +"	bctr					;"
> +#endif
> +"	.size dummy_tramp, .-dummy_tramp	;"
> +"	.popsection				;"
> +);
> +
> +void bpf_jit_build_fentry_stubs(u32 *image, struct codegen_context *ctx)
> +{
> +	int ool_stub_idx, long_branch_stub_idx;
> +
> +	/*
> +	 * Out-of-line stub:
> +	 *	mflr	r0
> +	 *	[b|bl]	tramp
> +	 *	mtlr	r0 // only with CONFIG_FTRACE_PFE_OUT_OF_LINE
> +	 *	b	bpf_func + 4
> +	 */
> +	ool_stub_idx = ctx->idx;
> +	EMIT(PPC_RAW_MFLR(_R0));
> +	EMIT(PPC_RAW_NOP());
> +	if (IS_ENABLED(CONFIG_FTRACE_PFE_OUT_OF_LINE))
> +		EMIT(PPC_RAW_MTLR(_R0));
> +	WARN_ON_ONCE(!is_offset_in_branch_range(4 - (long)ctx->idx * 4)); /* TODO */
> +	EMIT(PPC_RAW_BRANCH(4 - (long)ctx->idx * 4));
> +
> +	/*
> +	 * Long branch stub:
> +	 *	.long	<dummy_tramp_addr>
> +	 *	mflr	r11
> +	 *	bcl	20,31,$+4
> +	 *	mflr	r12
> +	 *	ld	r12, -8-SZL(r12)
> +	 *	mtctr	r12
> +	 *	mtlr	r11 // needed to retain ftrace ABI
> +	 *	bctr
> +	 */

You could avoid clobbering LR on >= POWER9 with the addpcis
instruction. Or even a prefixed pcrel load where ISA 3.1 is available.
I guess that's something to do later.
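To illustrate, a rough untested sketch with addpcis (lnia being the
extended mnemonic for addpcis rN,0; offsets illustrative, <...>
placeholders as in the patch comment):

	.long	<dummy_tramp_addr>
	lnia	r12			/* r12 = address of the next insn */
	ld	r12, -12(r12)		/* load tramp address from the slot */
	mtctr	r12
	bctr

And with a prefixed pcrel load (ISA 3.1), the address setup disappears
entirely:

	pld	r12, <dummy_tramp_addr_slot>@pcrel
	mtctr	r12
	bctr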

Thanks,
Nick




