Re: [PATCH bpf-next v1] ARC: Add eBPF JIT support

Björn Töpel <bjorn@xxxxxxxxxx> · Sun, 03 Mar 2024 18:18:48 +0100

Shahab,

Shahab Vahedi <list+bpf@xxxxxxxxxx> writes:

> From: Shahab Vahedi <shahab@xxxxxxxxxxxx>
>
> This will add eBPF JIT support to the 32-bit ARCv2 processors. The
> implementation is qualified by running the BPF tests on a Synopsys HSDK
> board with "ARC HS38 v2.1c at 500 MHz" as the 4-core CPU.

Cool!

I did quick review, mosty focusing on style, and not function. Some
general input/Qs:

What's the easiest way to test test this w/o ARC HW? Is there a qemu
port avaiable?

I don't know much about ARC -- Is v2 compatible with v3?

I'm curious about the missing support; tailcall/atomic/division/extable
support. Would it require a lot of work to add that support in the
inital change set?

There are a lot of checkpatch/kernel style issues. Run, e.g.,
"checkpatch --strict -g HEAD" and you'll get a bunch of issues. Most of
them are just basic style issues. Please try to fix most of them for the
next rev.

You should add yourself to the MAINTAINERS file.

Please try to avoid static inline in the C-files. The compiler usually
knows better.

[...]

> diff --git a/arch/arc/net/bpf_jit_core.c b/arch/arc/net/bpf_jit_core.c
> new file mode 100644
> index 000000000000..730a715d324e
> --- /dev/null
> +++ b/arch/arc/net/bpf_jit_core.c
> @@ -0,0 +1,1425 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * The back-end-agnostic part of Just-In-Time compiler for eBPF bytecode.
> + *
> + * Copyright (c) 2024 Synopsys Inc.
> + * Author: Shahab Vahedi <shahab@xxxxxxxxxxxx>
> + */
> +#include <asm/bug.h>
> +#include "bpf_jit.h"
> +
> +/* Sane initial values for the globals */
> +bool emit = true;
> +bool zext_thyself = true;

Hmm, this is racy. Can we move this into the jit context? Also, is
zext_thyself even used?

> +
> +/*
> + * Check for the return value. A pattern used oftenly in this file.
> + * There must be a "ret" variable of type "int" in the scope.
> + */
> +#define CHECK_RET(cmd)			\
> +	do {				\
> +		ret = (cmd);		\
> +		if (ret < 0)		\
> +			return ret;	\
> +	} while (0)
> +

Nit/personal taste, but I prefer not having these kind of macros. I
think it makes it harder to read the code.

> +#ifdef ARC_BPF_JIT_DEBUG
> +/* Dumps bytes in /var/log/messages at KERN_INFO level (4). */
> +static void dump_bytes(const u8 *buf, u32 len, const char *header)
> +{
> +	u8 line[64];
> +	size_t i, j;
> +
> +	pr_info("-----------------[ %s ]-----------------\n", header);
> +
> +	for (i = 0, j = 0; i < len; i++) {
> +		/* Last input byte? */
> +		if (i == len-1) {
> +			j += scnprintf(line+j, 64-j, "0x%02x", buf[i]);
> +			pr_info("%s\n", line);
> +			break;
> +		}
> +		/* End of line? */
> +		else if (i % 8 == 7) {
> +			j += scnprintf(line+j, 64-j, "0x%02x", buf[i]);
> +			pr_info("%s\n", line);
> +			j = 0;
> +		} else {
> +			j += scnprintf(line+j, 64-j, "0x%02x, ", buf[i]);
> +		}
> +	}
> +}
> +#endif /* ARC_BPF_JIT_DEBUG */
> +
> +/********************* JIT context ***********************/
> +
> +/*
> + * buf:		Translated instructions end up here.
> + * len:		The length of whole block in bytes.
> + * index:	The offset at which the _next_ instruction may be put.
> + */
> +struct jit_buffer {
> +	u8	*buf;
> +	u32	len;
> +	u32	index;
> +};
> +
> +/*
> + * This is a subset of "struct jit_context" that its information is deemed
> + * necessary for the next extra pass to come.
> + *
> + * bpf_header:	Needed to finally lock the region.
> + * bpf2insn:	Used to find the translation for instructions of interest.
> + *
> + * Things like "jit.buf" and "jit.len" can be retrieved respectively from
> + * "prog->bpf_func" and "prog->jited_len".
> + */
> +struct arc_jit_data {
> +	struct bpf_binary_header *bpf_header;
> +	u32                      *bpf2insn;
> +};
> +
> +/*
> + * The JIT pertinent context that is used by different functions.
> + *
> + * prog:		The current eBPF program being handled.
> + * orig_prog:		The original eBPF program before any possible change.
> + * jit:			The JIT buffer and its length.
> + * bpf_header:		The JITed program header. "jit.buf" points inside it.
> + * bpf2insn:		Maps BPF insn indices to their counterparts in jit.buf.
> + * bpf2insn_valid:	Indicates if "bpf2ins" is populated with the mappings.
> + * jit_data:		A piece of memory to transfer data to the next pass.
> + * arc_regs_clobbered:	Each bit status determines if that arc reg is clobbered.
> + * save_blink:		Whether ARC's "blink" register needs to be saved.
> + * frame_size:		Derived from "prog->aux->stack_depth".
> + * epilogue_offset:	Used by early "return"s in the code to jump here.
> + * need_extra_pass:	A forecast if an "extra_pass" will occur.
> + * is_extra_pass:	Indicates if the current pass is an extra pass.
> + * user_bpf_prog:	True, if VM opcodes come from a real program.
> + * blinded:		True if "constant blinding" step returned a new "prog".
> + * success:		Indicates if the whole JIT went OK.
> + */
> +struct jit_context {
> +	struct bpf_prog			*prog;
> +	struct bpf_prog			*orig_prog;
> +	struct jit_buffer		jit;
> +	struct bpf_binary_header	*bpf_header;
> +	u32				*bpf2insn;
> +	bool				bpf2insn_valid;
> +	struct arc_jit_data		*jit_data;
> +	u32				arc_regs_clobbered;
> +	bool				save_blink;
> +	u16				frame_size;
> +	u32				epilogue_offset;
> +	bool				need_extra_pass;
> +	bool				is_extra_pass;
> +	bool				user_bpf_prog;
> +	bool				blinded;
> +	bool				success;
> +};
> +
> +/*
> + * If we're in ARC_BPF_JIT_DEBUG mode and the debug level is right, dump the
> + * input BPF stream. "bpf_jit_dump()" is not fully suited for this purpose.

Care to elaborate a bit more on ARC_BPF_JIT_DEBUG. This smells of
duplicated funtionality with bpf_jit_dump(), and the BUG()s are scary.

> + */
> +static void vm_dump(const struct bpf_prog *prog)
> +{
> +#ifdef ARC_BPF_JIT_DEBUG
> +	if (bpf_jit_enable > 1)
> +		dump_bytes((u8 *) prog->insns, 8*prog->len, " VM  ");
> +#endif
> +}
> +
> +/*
> + * If the right level of debug is set, dump the bytes. There are 2 variants
> + * of this function:
> + *
> + * 1. Use the standard bpf_jit_dump() which is meant only for JITed code.
> + * 2. Use the dump_bytes() to match its "vm_dump()" instance.
> + */
> +static void jit_dump(const struct jit_context *ctx)
> +{
> +#ifdef ARC_BPF_JIT_DEBUG
> +	u8 header[8];
> +#endif
> +	const int pass = ctx->is_extra_pass ? 2 : 1;
> +
> +	if (bpf_jit_enable <= 1 || !ctx->prog->jited)
> +		return;
> +
> +#ifdef ARC_BPF_JIT_DEBUG
> +	scnprintf(header, sizeof(header), "JIT:%d", pass);
> +	dump_bytes(ctx->jit.buf, ctx->jit.len, header);
> +	pr_info("\n");
> +#else
> +	bpf_jit_dump(ctx->prog->len, ctx->jit.len, pass, ctx->jit.buf);
> +#endif
> +}
> +
> +/* Initialise the context so there's no garbage. */
> +static int jit_ctx_init(struct jit_context *ctx, struct bpf_prog *prog)
> +{
> +	ctx->orig_prog = prog;
> +
> +	/* If constant blinding was requested but failed, scram. */
> +	ctx->prog = bpf_jit_blind_constants(prog);
> +	if (IS_ERR(ctx->prog))
> +		return PTR_ERR(ctx->prog);
> +	ctx->blinded = (ctx->prog == ctx->orig_prog ? false : true);
> +
> +	ctx->jit.buf            = NULL;
> +	ctx->jit.len            = 0;
> +	ctx->jit.index          = 0;
> +	ctx->bpf_header         = NULL;
> +	ctx->bpf2insn           = NULL;
> +	ctx->bpf2insn_valid     = false;
> +	ctx->jit_data           = NULL;
> +	ctx->arc_regs_clobbered = 0;
> +	ctx->save_blink         = false;
> +	ctx->frame_size         = 0;
> +	ctx->epilogue_offset    = 0;
> +	ctx->need_extra_pass    = false;
> +	ctx->is_extra_pass	= ctx->prog->jited;
> +	ctx->user_bpf_prog	= ctx->prog->is_func;
> +	ctx->success            = false;

I'd just make sure that ctx is zeroed, and init the non-zero members here.

Björn