This commit changes the behavior of TC and XDP hooks during attachment such that any BPF_MAP_TYPE_PROG_ARRAY that the prog uses has an extra uref taken. The goal behind this change is to try and prevent confusion for the majority of use cases. The current behavior where when the last uref is dropped the prog array map is emptied is quite confusing. Confusing enough for there to be multiple references to it in ebpf-go [0][1]. Completely solving the problem is difficult. As stated in c9da161c6517 ("bpf: fix clearing on persistent program array maps"), it is difficult-to-impossible to walk the full dependency graph b/c it is too dynamic. However in practice, I've found that all progs in a tailcall chain share the same prog array map. Knowing that, if we take a uref on any used prog array map when the program is attached, we can simplify the majority use case and make it more ergonomic. I'll be the first to admit this is not a very clean solution. It does not fully solve the problem. Nor does it make overall logic any simpler. But I do think it makes a pretty big usability hole slightly smaller. [0]: https://github.com/cilium/ebpf/blob/01ebd4c1e2b9f8b3dd4fd2382aa1092c3c9bfc9d/doc.go#L22-L24 [1]: https://github.com/cilium/ebpf/blob/d1a52333f2c0fed085f8d742a5a3c164795d8492/collection.go#L320-L321 Signed-off-by: Daniel Xu <dxu@xxxxxxxxx> --- kernel/bpf/syscall.c | 40 +++++++++++++++++++++++++++++++++++++++- 1 file changed, 39 insertions(+), 1 deletion(-) diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index d8e5530598f3..6706bb1c8e16 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2164,8 +2164,37 @@ void bpf_prog_put(struct bpf_prog *prog) } EXPORT_SYMBOL_GPL(bpf_prog_put); +/* Whether this program type, when attached, take a uref on prog array maps */ +static bool bpf_prog_should_pin_uref(enum bpf_prog_type type) +{ + switch (type) { + case BPF_PROG_TYPE_SCHED_CLS: + case BPF_PROG_TYPE_SCHED_ACT: + case BPF_PROG_TYPE_XDP: + return true; + default: + return false; + } +} + +static void __bpf_prog_add_prog_array_urefs(struct bpf_prog *prog, s64 cnt) +{ + struct bpf_map *map; + u32 i; + + mutex_lock(&prog->aux->used_maps_mutex); + for (i = 0; i < prog->aux->used_map_cnt; i++) { + map = prog->aux->used_maps[i]; + if (IS_FD_PROG_ARRAY(map)) + atomic64_add(cnt, &map->usercnt); + } + mutex_unlock(&prog->aux->used_maps_mutex); +} + void bpf_prog_put_dev(struct bpf_prog *prog) { + if (bpf_prog_should_pin_uref(prog->type)) + __bpf_prog_add_prog_array_urefs(prog, -1); bpf_prog_put(prog); } EXPORT_SYMBOL_GPL(bpf_prog_put_dev); @@ -2366,7 +2395,16 @@ struct bpf_prog *bpf_prog_get(u32 ufd) struct bpf_prog *bpf_prog_get_type_dev(u32 ufd, enum bpf_prog_type type, bool attach_drv) { - return __bpf_prog_get(ufd, &type, attach_drv); + struct bpf_prog *prog; + + prog = __bpf_prog_get(ufd, &type, attach_drv); + if (IS_ERR(prog)) + goto out; + + if (bpf_prog_should_pin_uref(type)) + __bpf_prog_add_prog_array_urefs(prog, 1); +out: + return prog; } EXPORT_SYMBOL_GPL(bpf_prog_get_type_dev); -- 2.41.0