Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> [Mon, 2020-04-13 15:00 -0700]: > On Mon, Apr 13, 2020 at 1:21 PM Andrey Ignatov <rdna@xxxxxx> wrote: > > > > Andrii Nakryiko <andriin@xxxxxx> [Sat, 2020-04-11 22:58 -0700]: > > > For some types of BPF programs that utilize expected_attach_type, libbpf won't > > > set load_attr.expected_attach_type, even if expected_attach_type is known from > > > section definition. This was done to preserve backwards compatibility with old > > > kernels that didn't recognize expected_attach_type attribute yet (which was > > > added in 5e43f899b03a ("bpf: Check attach type at prog load time"). But this > > > is problematic for some BPF programs that utilize never features that require > > > kernel to know specific expected_attach_type (e.g., extended set of return > > > codes for cgroup_skb/egress programs). > > > > > > This patch makes libbpf specify expected_attach_type by default, but also > > > detect support for this field in kernel and not set it during program load. > > > This allows to have a good metadata for bpf_program > > > (e.g., bpf_program__get_extected_attach_type()), but still work with old > > > kernels (for cases where it can work at all). > > > > > > Additionally, due to expected_attach_type being always set for recognized > > > program types, bpf_program__attach_cgroup doesn't have to do extra checks to > > > determine correct attach type, so remove that additional logic. > > > > > > Also adjust section_names selftest to account for this change. > > > > > > More detailed discussion can be found in [0]. > > > > > > [0] https://lore.kernel.org/bpf/20200412003604.GA15986@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx/ > > > > > > Reported-by: Andrey Ignatov <rdna@xxxxxx> > > > Signed-off-by: Andrii Nakryiko <andriin@xxxxxx> > > > --- > > > tools/lib/bpf/libbpf.c | 123 +++++++++++------- > > > .../selftests/bpf/prog_tests/section_names.c | 42 +++--- > > > 2 files changed, 106 insertions(+), 59 deletions(-) > > > > > > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c > > > index ff9174282a8c..925f720deea0 100644 > > > --- a/tools/lib/bpf/libbpf.c > > > +++ b/tools/lib/bpf/libbpf.c > > > @@ -178,6 +178,8 @@ struct bpf_capabilities { > > > __u32 array_mmap:1; > > > /* BTF_FUNC_GLOBAL is supported */ > > > __u32 btf_func_global:1; > > > + /* kernel support for expected_attach_type in BPF_PROG_LOAD */ > > > + __u32 exp_attach_type:1; > > > }; > > > > > > enum reloc_type { > > > @@ -194,6 +196,22 @@ struct reloc_desc { > > > int sym_off; > > > }; > > > > > > +struct bpf_sec_def; > > > + > > > +typedef struct bpf_link *(*attach_fn_t)(const struct bpf_sec_def *sec, > > > + struct bpf_program *prog); > > > + > > > +struct bpf_sec_def { > > > + const char *sec; > > > + size_t len; > > > + enum bpf_prog_type prog_type; > > > + enum bpf_attach_type expected_attach_type; > > > + bool is_exp_attach_type_optional; > > > + bool is_attachable; > > > + bool is_attach_btf; > > > + attach_fn_t attach_fn; > > > +}; > > > + > > > /* > > > * bpf_prog should be a better name but it has been used in > > > * linux/filter.h. > > > @@ -204,6 +222,7 @@ struct bpf_program { > > > char *name; > > > int prog_ifindex; > > > char *section_name; > > > + const struct bpf_sec_def *sec_def; > > > /* section_name with / replaced by _; makes recursive pinning > > > * in bpf_object__pin_programs easier > > > */ > > > @@ -3315,6 +3334,32 @@ static int bpf_object__probe_array_mmap(struct bpf_object *obj) > > > return 0; > > > } > > > > > > +static int > > > +bpf_object__probe_exp_attach_type(struct bpf_object *obj) > > > +{ > > > + struct bpf_load_program_attr attr; > > > + struct bpf_insn insns[] = { > > > + BPF_MOV64_IMM(BPF_REG_0, 0), > > > + BPF_EXIT_INSN(), > > > + }; > > > + int fd; > > > + > > > + memset(&attr, 0, sizeof(attr)); > > > + attr.prog_type = BPF_PROG_TYPE_CGROUP_SOCK; > > > + attr.expected_attach_type = BPF_CGROUP_INET_EGRESS; > > > > Could you clarify semantics of this function please? > > > > According to the name it looks like it should check whether > > expected_attach_type attribute is supported or not. But > > BPF_CGROUP_INET_EGRESS doesn't align with this since > > expected_attach_type itself was added long before it was supported for > > BPF_CGROUP_INET_EGRESS. > > > > For example 4fbac77d2d09 ("bpf: Hooks for sys_bind") added in Mar 2018 is > > the first hook ever that used expected_attach_type. > > > > aac3fc320d94 ("bpf: Post-hooks for sys_bind") added a bit later is the > > first hook that made expected_attach_type optional (for > > BPF_CGROUP_INET_SOCK_CREATE). > > > > But 5cf1e9145630 ("bpf: cgroup inet skb programs can return 0 to 3") for > > BPF_CGROUP_INET_EGRESS was merged more than a year after the previous > > two. > > I'm checking if kernel is rejecting non-zero expected_attach_type > field in bpf_attr for BPF_PROG_LOAD command. > > Before 5e43f899b03a ("bpf: Check attach type at prog load time"), > kernel would reject non-zero expected_attach_type because > expected_attach_type didn't exist in bpf_attr. So if that's the case, > we shouldn't specify expected_attach_type. > > After that commit, BPF_CGROUP_INET_EGRESS for > BPF_PROG_TYPE_CGROUP_SOCK would be supported, even if it is optional, > so using that combination should work. > > Did I miss something? So you're saying that there is nothing special about BPF_CGROUP_INET_EGRESS, you just need _any_ attach type and it just happened so that you used this one. That sounds fine, but could you clarify it with a comment please, otherwise it looks confusing: "why BPF_CGROUP_INET_EGRESS and not some other attach type, for example any of those which got optional expected_attach_type earlier, what's so special about BPF_CGROUP_INET_EGRESS". Also, I just realized that this combination: BPF_PROG_TYPE_CGROUP_SOCK and BPF_CGROUP_INET_EGRESS is incorrect: BPF_CGROUP_INET_EGRESS attach type corresponds to BPF_PROG_TYPE_CGROUP_SKB prog type, not to BPF_PROG_TYPE_CGROUP_SOCK. This call will always fail with EINVAL on new kernels, because of this code in bpf_prog_load_check_attach: switch (prog_type) { case BPF_PROG_TYPE_CGROUP_SOCK: switch (expected_attach_type) { case BPF_CGROUP_INET_SOCK_CREATE: case BPF_CGROUP_INET4_POST_BIND: case BPF_CGROUP_INET6_POST_BIND: return 0; default: return -EINVAL; } That should be fixed. And since this has to be changed anyway I'd go with BPF_PROG_TYPE_CGROUP_SOCK and BPF_CGROUP_INET_SOCK_CREATE combination since this is the very first combination in kernel that relied on optional expected_attach_type -- that would make more sense IMO. And clarifying comment as mentioned above :) > > > + attr.insns = insns; > > > + attr.insns_cnt = ARRAY_SIZE(insns); > > > + attr.license = "GPL"; > > > + > > > + fd = bpf_load_program_xattr(&attr, NULL, 0); > > > + if (fd >= 0) { > > > + obj->caps.exp_attach_type = 1; > > > + close(fd); > > > + return 1; > > > + } > > > + return 0; > > > +} > > > + > > > static int > > > bpf_object__probe_caps(struct bpf_object *obj) > > > { > > > @@ -3325,6 +3370,7 @@ bpf_object__probe_caps(struct bpf_object *obj) > > > bpf_object__probe_btf_func_global, > > > bpf_object__probe_btf_datasec, > > > bpf_object__probe_array_mmap, > > > + bpf_object__probe_exp_attach_type, > > > }; > > > int i, ret; > > > > > > @@ -4861,7 +4907,13 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, > > > > > > memset(&load_attr, 0, sizeof(struct bpf_load_program_attr)); > > > load_attr.prog_type = prog->type; > > > - load_attr.expected_attach_type = prog->expected_attach_type; > > > + /* old kernels might not support specifying expected_attach_type */ > > > + if (!prog->caps->exp_attach_type && prog->sec_def && > > > + prog->sec_def->is_exp_attach_type_optional) > > > + load_attr.expected_attach_type = 0; > > > + else > > > + load_attr.expected_attach_type = prog->expected_attach_type; > > > > I'm having a hard time checking whether it'll work for all cases or may > > not work for some combination of prog/attach type and kernel version > > since there are many subtleties. > > > > For example BPF_PROG_TYPE_CGROUP_SOCK has both a hook where > > expected_attach_type is optional (BPF_CGROUP_INET_SOCK_CREATE) and hooks > > where it's required (BPF_CGROUP_INET{4,6}_POST_BIND), and there > > bpf_prog_load_fixup_attach_type() function in always sets > > expected_attach_type if it's not yet. > > Right, so we use the fact that they are allowed, even if optional. > Libbpf should provide correct expected_attach_type, according to > section definitions and kernel should be happy (unless user specified > wrong section name, of course, but we can't help that). > > > > > But I don't have context on all hooks that can be affected by this > > change and could easily miss something. > > > > Ideally it should be verified by tests. Current section_names.c test > > only verifies what will be returned, but AFAIK there is no test that > > checks whether provided combination of prog_type/expected_attach_type at > > load time and attach_type at attach time would actually work both on > > current and old kernels. Do you think it's possible to add such a > > selftest? (current libbpf CI supports running on old kernels, doesn't > > it?) > > So all the existing selftests are essentially verifying this, if run > on old kernel. I don't think libbpf currently runs tests on such old > kernels, though. But there is no extra selftest that we need to add, > because every single existing one will execute this piece of libbpf > logic. Apparently existing tests didn't catch the very obvious bug with BPF_PROG_TYPE_CGROUP_SOCK / BPF_CGROUP_INET_EGRESS invalid combination. I think it'd be useful to start with at least basic test focused on expected_attach_type. Then later extend it to new attach types when they're being added and, ideally, to existing ones. > > > + pr_warn("prog %s exp_Attach %d\n", prog->name, load_attr.expected_attach_type); > > > if (prog->caps->name) > > > load_attr.name = prog->name; > > > load_attr.insns = insns; > > > @@ -5062,6 +5114,8 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level) > > > return 0; > > > } > > > > > trimming irrelevant parts is good ;) ack -- Andrey Ignatov