On Fri, Apr 30, 2021 at 11:32 PM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote: > > On Sat, May 01, 2021 at 01:05:40AM IST, Andrii Nakryiko wrote: > > On Wed, Apr 28, 2021 at 9:26 AM Kumar Kartikeya Dwivedi > > <memxor@xxxxxxxxx> wrote: > > > > > > This adds functions that wrap the netlink API used for adding, > > > manipulating, and removing traffic control filters. > > > > > > An API summary: > > > > > > A bpf_tc_hook represents a location where a TC-BPF filter can be > > > attached. This means that creating a hook leads to creation of the > > > backing qdisc, while destruction either removes all filters attached to > > > a hook, or destroys qdisc if requested explicitly (as discussed below). > > > > > > The TC-BPF API functions operate on this bpf_tc_hook to attach, replace, > > > query, and detach tc filters. > > > > > > All functions return 0 on success, and a negative error code on failure. > > > [...] > > > > > > Reviewed-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx> > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> > > > --- > > > > API looks good to me (except the flags field that just stands out). > > But I'll defer to Daniel to make the final call. > > > > > tools/lib/bpf/libbpf.h | 41 ++++ > > > tools/lib/bpf/libbpf.map | 5 + > > > tools/lib/bpf/netlink.c | 463 ++++++++++++++++++++++++++++++++++++++- > > > 3 files changed, 508 insertions(+), 1 deletion(-) > > > > > > diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h > > > index bec4e6a6e31d..3de701f46a33 100644 > > > --- a/tools/lib/bpf/libbpf.h > > > +++ b/tools/lib/bpf/libbpf.h > > > @@ -775,6 +775,47 @@ LIBBPF_API int bpf_linker__add_file(struct bpf_linker *linker, const char *filen > > > LIBBPF_API int bpf_linker__finalize(struct bpf_linker *linker); > > > LIBBPF_API void bpf_linker__free(struct bpf_linker *linker); > > > > > > +enum bpf_tc_attach_point { > > > + BPF_TC_INGRESS = 1 << 0, > > > + BPF_TC_EGRESS = 1 << 1, > > > + BPF_TC_CUSTOM = 1 << 2, > > > +}; > > > + > > > +enum bpf_tc_attach_flags { > > > + BPF_TC_F_REPLACE = 1 << 0, > > > +}; > > > + > > > +struct bpf_tc_hook { > > > + size_t sz; > > > + int ifindex; > > > + enum bpf_tc_attach_point attach_point; > > > + __u32 parent; > > > + size_t :0; > > > +}; > > > + > > > +#define bpf_tc_hook__last_field parent > > > + > > > +struct bpf_tc_opts { > > > + size_t sz; > > > + int prog_fd; > > > + __u32 prog_id; > > > + __u32 handle; > > > + __u32 priority; > > > + size_t :0; > > > +}; > > > + > > > +#define bpf_tc_opts__last_field priority > > > + > > > +LIBBPF_API int bpf_tc_hook_create(struct bpf_tc_hook *hook, int flags); > > > +LIBBPF_API int bpf_tc_hook_destroy(struct bpf_tc_hook *hook); > > > +LIBBPF_API int bpf_tc_attach(const struct bpf_tc_hook *hook, > > > + struct bpf_tc_opts *opts, > > > + int flags); > > > > why didn't you put flags into bpf_tc_opts? they are clearly optional > > and fit into "opts" paradigm... > > > > I can move this into opts, but during previous discussion it was kept outside > opts by Daniel, so I kept that unchanged. for bpf_tc_attach() I see no reason to keep flags separate. For bpf_tc_hook_create()... for extensibility it would need it's own opts for hook creation. But if flags is 99% the only thing we'll need, then we can always add extra bpf_tc_hook_create_opts() later. > > > > +LIBBPF_API int bpf_tc_detach(const struct bpf_tc_hook *hook, > > > + const struct bpf_tc_opts *opts); > > > +LIBBPF_API int bpf_tc_query(const struct bpf_tc_hook *hook, > > > + struct bpf_tc_opts *opts); > > > + > > > #ifdef __cplusplus > > > } /* extern "C" */ > > > #endif [...] > > > + return -EINVAL; > > > + > > > + return tc_qdisc_create_excl(hook, flags); > > > +} > > > + > > > +static int tc_cls_detach(const struct bpf_tc_hook *hook, > > > + const struct bpf_tc_opts *opts, bool flush); > > > + > > > +int bpf_tc_hook_destroy(struct bpf_tc_hook *hook) > > > +{ > > > + if (!hook || !OPTS_VALID(hook, bpf_tc_hook) || > > > + OPTS_GET(hook, ifindex, 0) <= 0) > > > + return -EINVAL; > > > + > > > + switch ((int)OPTS_GET(hook, attach_point, 0)) { > > > > int casting. Did the compiler complain about that or what? > > > > It complains on -Wswitch, as we switch on values apart from the enum values, but > I'll see if I can remove it. ah, because of BPF_TC_INGRESS|BPF_TC_EGRESS? That sucks, of course. An alternative I guess is just declaring BPF_TC_INGRESS_EGRESS = BPF_TC_INGRESS | BPF_TC_EGRESS, but I don't know how awful that would be. > > > > + case BPF_TC_INGRESS: > > > + case BPF_TC_EGRESS: > > > + return tc_cls_detach(hook, NULL, true); > > > + case BPF_TC_INGRESS|BPF_TC_EGRESS: > > > + return tc_qdisc_delete(hook); > > > + case BPF_TC_CUSTOM: > > > + return -EOPNOTSUPP; > > > + default: > > > + return -EINVAL; > > > + } > > > +} > > > + [...]