On 1/11/21 1:29 PM, Andrii Nakryiko wrote:
On Sun, Jan 10, 2021 at 8:13 PM Yonghong Song <yhs@xxxxxx> wrote:
On 1/8/21 2:09 PM, Andrii Nakryiko wrote:
Add support for directly accessing kernel module variables from BPF programs
using special ldimm64 instructions. This functionality builds upon vmlinux
ksym support, but extends ldimm64 with src_reg=BPF_PSEUDO_BTF_ID to allow
specifying kernel module BTF's FD in insn[1].imm field.
During BPF program load time, verifier will resolve FD to BTF object and will
take reference on BTF object itself and, for module BTFs, corresponding module
as well, to make sure it won't be unloaded from under running BPF program. The
mechanism used is similar to how bpf_prog keeps track of used bpf_maps.
One interesting change is also in how per-CPU variable is determined. The
logic is to find .data..percpu data section in provided BTF, but both vmlinux
and module each have their own .data..percpu entries in BTF. So for module's
case, the search for DATASEC record needs to look at only module's added BTF
types. This is implemented with custom search function.
Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
Ack with a minor nit below.
Acked-by: Yonghong Song <yhs@xxxxxx>
---
include/linux/bpf.h | 10 +++
include/linux/bpf_verifier.h | 3 +
include/linux/btf.h | 3 +
kernel/bpf/btf.c | 31 +++++++-
kernel/bpf/core.c | 23 ++++++
kernel/bpf/verifier.c | 149 ++++++++++++++++++++++++++++-------
6 files changed, 189 insertions(+), 30 deletions(-)
[...]
/* replace pseudo btf_id with kernel symbol address */
static int check_pseudo_btf_id(struct bpf_verifier_env *env,
struct bpf_insn *insn,
[...]
} else {
aux->btf_var.reg_type = PTR_TO_BTF_ID;
- aux->btf_var.btf = btf_vmlinux;
+ aux->btf_var.btf = btf;
aux->btf_var.btf_id = type;
}
+
+ /* check whether we recorded this BTF (and maybe module) already */
+ for (i = 0; i < env->used_btf_cnt; i++) {
+ if (env->used_btfs[i].btf == btf) {
+ btf_put(btf);
+ return 0;
An alternative way is to change the above code as
err = 0;
goto err_put;
I didn't do it, because it's not really an error case, which err_put
implies. If in the future we'll have some more clean up to do, it
might make sense, I suppose.
You can change label err_put to btf_put, so this way, btf_put() will
only show up in one place. But I won't insist on this.
+ }
+ }
+
+ if (env->used_btf_cnt >= MAX_USED_BTFS) {
+ err = -E2BIG;
+ goto err_put;
+ }
+
+ btf_mod = &env->used_btfs[env->used_btf_cnt];
+ btf_mod->btf = btf;
+ btf_mod->module = NULL;
+
+ /* if we reference variables from kernel module, bump its refcount */
+ if (btf_is_module(btf)) {
+ btf_mod->module = btf_try_get_module(btf);
+ if (!btf_mod->module) {
+ err = -ENXIO;
+ goto err_put;
+ }
+ }
+
+ env->used_btf_cnt++;
+
return 0;
+err_put:
+ btf_put(btf);
+ return err;
}
[...]