On Thu, Mar 18, 2021 at 06:03:49PM -0700, Andrii Nakryiko wrote: > On Tue, Mar 16, 2021 at 12:01 AM Martin KaFai Lau <kafai@xxxxxx> wrote: > > > > This patch adds support to BPF verifier to allow bpf program calling > > kernel function directly. > > > > The use case included in this set is to allow bpf-tcp-cc to directly > > call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()"). Those > > functions have already been used by some kernel tcp-cc implementations. > > > > This set will also allow the bpf-tcp-cc program to directly call the > > kernel tcp-cc implementation, For example, a bpf_dctcp may only want to > > implement its own dctcp_cwnd_event() and reuse other dctcp_*() directly > > from the kernel tcp_dctcp.c instead of reimplementing (or > > copy-and-pasting) them. > > > > The tcp-cc kernel functions mentioned above will be white listed > > for the struct_ops bpf-tcp-cc programs to use in a later patch. > > The white listed functions are not bounded to a fixed ABI contract. > > Those functions have already been used by the existing kernel tcp-cc. > > If any of them has changed, both in-tree and out-of-tree kernel tcp-cc > > implementations have to be changed. The same goes for the struct_ops > > bpf-tcp-cc programs which have to be adjusted accordingly. > > > > This patch is to make the required changes in the bpf verifier. > > > > First change is in btf.c, it adds a case in "do_btf_check_func_arg_match()". > > When the passed in "btf->kernel_btf == true", it means matching the > > verifier regs' states with a kernel function. This will handle the > > PTR_TO_BTF_ID reg. It also maps PTR_TO_SOCK_COMMON, PTR_TO_SOCKET, > > and PTR_TO_TCP_SOCK to its kernel's btf_id. > > > > In the later libbpf patch, the insn calling a kernel function will > > look like: > > > > insn->code == (BPF_JMP | BPF_CALL) > > insn->src_reg == BPF_PSEUDO_KFUNC_CALL /* <- new in this patch */ > > insn->imm == func_btf_id /* btf_id of the running kernel */ > > > > [ For the future calling function-in-kernel-module support, an array > > of module btf_fds can be passed at the load time and insn->off > > can be used to index into this array. ] > > > > At the early stage of verifier, the verifier will collect all kernel > > function calls into "struct bpf_kern_func_descriptor". Those > > descriptors are stored in "prog->aux->kfunc_tab" and will > > be available to the JIT. Since this "add" operation is similar > > to the current "add_subprog()" and looking for the same insn->code, > > they are done together in the new "add_subprog_and_kern_func()". > > > > In the "do_check()" stage, the new "check_kern_func_call()" is added > > to verify the kernel function call instruction: > > 1. Ensure the kernel function can be used by a particular BPF_PROG_TYPE. > > A new bpf_verifier_ops "check_kern_func_call" is added to do that. > > The bpf-tcp-cc struct_ops program will implement this function in > > a later patch. > > 2. Call "btf_check_kern_func_args_match()" to ensure the regs can be > > used as the args of a kernel function. > > 3. Mark the regs' type, subreg_def, and zext_dst. > > > > At the later do_misc_fixups() stage, the new fixup_kern_func_call() > > will replace the insn->imm with the function address (relative > > to __bpf_call_base). If needed, the jit can find the btf_func_model > > by calling the new bpf_jit_find_kern_func_model(prog, insn->imm). > > With the imm set to the function address, "bpftool prog dump xlated" > > will be able to display the kernel function calls the same way as > > it displays other bpf helper calls. > > > > gpl_compatible program is required to call kernel function. > > > > This feature currently requires JIT. > > > > Signed-off-by: Martin KaFai Lau <kafai@xxxxxx> > > --- > > After the initial pass it all makes sense so far. I am a bit concerned > about s32 and kernel function offset, though. See below. > > Also "kern_func" and "descriptor" are quite mouthful, it seems to me > that using kfunc consistently wouldn't hurt readability at all. You > also already use desc in place of "descriptor" for variables, so I'd > do that in type names as well. The descriptor/desc naming follows the existing poke descriptor and some of its helper naming. Sure. both can be renamed in v2. s/descriptor/desc/ s/kern_func/kfunc/ > > +static int kern_func_desc_cmp_by_imm(const void *a, const void *b) > > +{ > > + const struct bpf_kern_func_descriptor *d0 = a; > > + const struct bpf_kern_func_descriptor *d1 = b; > > + > > + return d0->imm - d1->imm; > > this is not safe, assuming any possible s32 values, no? Good catch. will fix.