Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> [Sat, 2020-04-11 15:42 -0700]: > On Fri, Apr 10, 2020 at 03:57:39PM -0700, Andrey Ignatov wrote: > > > > > Seems like kernels accept expected_attach_type > > > for a while now, so it might be ok backwards compatibility-wise? > > > > Sure, that commit is from 2018, but I guess backward compatibility > > should still be maintained in this case? That's a question to > > maintainers though. If simply changing BPF_APROG_SEC to BPF_EAPROG_SEC > > is an option that works for me. > > > > > > > Otherwise, we can teach libbpf to retry program load without expected > > > attach type for cgroup_skb/egress? > > > > Looks like a lot of work compared to simply adding a new section name > > (or changing existing section if backward compatibility is not a concern > > here). > > I still don't understand that backward compatiblity concern. > Fixing libbpf to do BPF_EAPROG_SEC("cgroup_skb/egress" > will make egress progs to fail at load time if they use > 1 return value on old > kernels No. Changing BPF_APROG_SEC to BPF_EAPROG_SEC will fail loading programs on old kernels with any return value, incl. 0 and 1. That's the point. > and fail at load time for > 3 return value on new kernels. Without > libbpf fix such progs would rely on old and new kernels internal implementation > details. Since on the latest kernel with current libbpf behavior the egress > prog will get loaded as ingress type with return value 4 and then gets attached > at egress. Argh. One really need to deep dive into kernel sources to figure out > what kernel will do with such return value. Such behavior is undefined and broken. > Did I misunderstand the whole issue? Let me try to explain with an example. I patched libbpf and built bpftool with this patch: 17:00:43 0 rdna@dev082.prn2:~/bpf-next$>git di tools/lib/bpf/ diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index ff9174282a8c..cc4d5b74e64a 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -6330,7 +6330,7 @@ static const struct bpf_sec_def section_defs[] = { BPF_PROG_SEC("lwt_seg6local", BPF_PROG_TYPE_LWT_SEG6LOCAL), BPF_APROG_SEC("cgroup_skb/ingress", BPF_PROG_TYPE_CGROUP_SKB, BPF_CGROUP_INET_INGRESS), - BPF_APROG_SEC("cgroup_skb/egress", BPF_PROG_TYPE_CGROUP_SKB, + BPF_EAPROG_SEC("cgroup_skb/egress", BPF_PROG_TYPE_CGROUP_SKB, BPF_CGROUP_INET_EGRESS), BPF_APROG_COMPAT("cgroup/skb", BPF_PROG_TYPE_CGROUP_SKB), BPF_APROG_SEC("cgroup/sock", BPF_PROG_TYPE_CGROUP_SOCK, Wrote a simple cgroup_skb/egress program that returns 1 and built it: 17:00:59 0 rdna@dev082.prn2:~/bpf-next$>cat tools/testing/selftests/bpf/progs/skb_ret1.c #include <linux/bpf.h> #include <bpf/bpf_helpers.h> int _version SEC("version") = 1; char _license[] SEC("license") = "GPL"; SEC("cgroup_skb/egress") int skb_ret1(struct __sk_buff *skb) { return 1; } Made sure that it can be loaded on new kernel: 17:00:39 0 rdna@dev082.prn2:~/bpf-next$>uname -srm Linux 5.2.9-93_fbk11_rc1_3610_g6a108f4f4a2b x86_64 17:01:10 0 rdna@dev082.prn2:~/bpf-next$>sudo ./tools/bpf/bpftool/bpftool p load tools/testing/selftests/bpf/skb_ret1.o /sys/fs/bpf/skb_ret1 17:01:25 0 rdna@dev082.prn2:~/bpf-next$>sudo ./tools/bpf/bpftool/bpftool p l pinned /sys/fs/bpf/skb_ret1 87347: cgroup_skb name skb_ret1 tag 9bf8e2a8bee581f5 gpl loaded_at 2020-04-11T17:01:25-0700 uid 0 xlated 16B jited 40B memlock 4096B btf_id 4952 Then copied both bpftool and the program to a server with old kernel (that doesn't have 5e43f899b03a ("bpf: Check attach type at prog load time")) and tried same thing (note "./bpftool"): [root@<some_host> ~]# uname -srm Linux 4.11.3-45_fbk11_3602_gd67c71c x86_64 [root@<some_host> ~]# ./bpftool p load ./skb_ret1.o /sys/fs/bpf/skb_ret1 libbpf: Error loading BTF: Invalid argument(22) libbpf: Error loading .BTF into kernel: -22. libbpf: load bpf program failed: Invalid argument libbpf: failed to load program 'cgroup_skb/egress' libbpf: failed to load object './skb_ret1.o' Error: failed to load object file [root@<some_host> ~]# echo $? 255 [root@<some_host> ~]# ./bpftool p l pinned /sys/fs/bpf/skb_ret1 Error: bpf obj get (/sys/fs/bpf/skb_ret1): No such file or directory As you can see it fails to load (BTF errors are not relevant). Then I tried to load same program on same old kernel but with prod bpftool (w/o my patch) just to make sure BTF is not a problem and it loads fine: [root@<some_host> ~]# bpftool p load ./skb_ret1.o /sys/fs/bpf/skb_ret1 libbpf: Error loading BTF: Invalid argument(22) libbpf: Error loading .BTF into kernel: -22. [root@<some_host> ~]# echo $? 0 [root@<some_host> ~]# bpftool p l pinned /sys/fs/bpf/skb_ret1 23944: cgroup_skb name skb_ret1 tag 9bf8e2a8bee581f5 loaded_at 2020-04-11T17:12:07-0700 uid 0 xlated 16B jited 64B memlock 4096B [root@<some_host> ~]# That's expected becase the error at loading time happens long before running verifier and doesn't have anything to do with what program returns. It fails on `if (CHECK_ATTR(BPF_PROG_LOAD))` in kernel's bpf_prog_load() -- the very first check that happens in that function. Old kernel checks that passed by user-space bpf_attr has all bytes after bpf_attr.prog_ifindex (the BPF_PROG_LOAD_LAST_FIELD known to that kernel) zero, but finds that there is non-zero unknown field, that is expected_attach_type, and fails. So changing BPF_APROG_SEC to BPF_EAPROG_SEC will break loading cgroup skb egress progs on any kernel before 5e43f899b03a ("bpf: Check attach type at prog load time"), no matter what those programs do. That's why I think it's not a good idea. Does it clarify? -- Andrey Ignatov