Re: [PATCH bpf-next v3 2/3] libbpf: add low level TC-BPF API

On 4/22/21 5:43 AM, Andrii Nakryiko wrote:
On Wed, Apr 21, 2021 at 3:59 PM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
On 4/20/21 9:37 PM, Kumar Kartikeya Dwivedi wrote:
[...]
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index bec4e6a6e31d..b4ed6a41ea70 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -16,6 +16,8 @@
   #include <stdbool.h>
   #include <sys/types.h>  // for size_t
   #include <linux/bpf.h>
+#include <linux/pkt_sched.h>
+#include <linux/tc_act/tc_bpf.h>

   #include "libbpf_common.h"

@@ -775,6 +777,48 @@ LIBBPF_API int bpf_linker__add_file(struct bpf_linker *linker, const char *filen
   LIBBPF_API int bpf_linker__finalize(struct bpf_linker *linker);
   LIBBPF_API void bpf_linker__free(struct bpf_linker *linker);

+/* Convenience macros for the clsact attach hooks */
+#define BPF_TC_CLSACT_INGRESS TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_INGRESS)
+#define BPF_TC_CLSACT_EGRESS TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_EGRESS)

I would abstract those away into an enum, plus avoid having to pull in
linux/pkt_sched.h and linux/tc_act/tc_bpf.h from main libbpf.h header.

Just add an enum { BPF_TC_DIR_INGRESS, BPF_TC_DIR_EGRESS, } and then the
concrete tc bits (TC_H_MAKE()) can be translated internally.
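
A rough sketch of what that could look like, not the final API. The enum
tag bpf_tc_dir and the helper tc_dir_to_parent() are made-up names for
illustration; only the TC_H_* macros come from linux/pkt_sched.h. The
point is that the UAPI header is included by the .c file only, so
libbpf.h does not have to pull it in.

```c
#include <errno.h>
#include <linux/pkt_sched.h>   /* TC_H_MAKE, TC_H_CLSACT, TC_H_MIN_* */
#include <linux/types.h>       /* __u32 */

/* Public side: all that a libbpf.h user would need to see. */
enum bpf_tc_dir {
	BPF_TC_DIR_INGRESS,
	BPF_TC_DIR_EGRESS,
};

/* Internal side (hypothetical helper): map a direction to the clsact parent. */
static int tc_dir_to_parent(enum bpf_tc_dir dir, __u32 *parent)
{
	switch (dir) {
	case BPF_TC_DIR_INGRESS:
		*parent = TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_INGRESS);
		return 0;
	case BPF_TC_DIR_EGRESS:
		*parent = TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_EGRESS);
		return 0;
	default:
		/* Unknown direction: let the caller see a clear error. */
		return -EINVAL;
	}
}
```

With this, the attach/detach entry points would take the enum, and the
parent value would be computed right before the netlink request is built.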

+struct bpf_tc_opts {
+     size_t sz;

Is this set anywhere?

+     __u32 handle;
+     __u32 class_id;

I'd remove class_id from here as well, given that in direct-action mode a
BPF prog can set it if needed.

+     __u16 priority;
+     bool replace;
+     size_t :0;

What's the rationale for this padding?

+};
+
+#define bpf_tc_opts__last_field replace
+
+/* Acts as a handle for an attached filter */
+struct bpf_tc_attach_id {

nit: maybe bpf_tc_ctx

ok, so wait. It seems like apart from the INGRESS|EGRESS enum and ifindex,
everything else is optional and/or has some sane defaults, right? So
this bpf_tc_attach_id or bpf_tc_ctx seems like a somewhat artificial
construct, and it will cause problems for extending this.

So if my understanding is correct, I'd get rid of it completely. As I
said previously, opts allow returning parameters back, so if user
didn't specify handle and priority and kernel picks values on user's
behalf, we can return them in the same opts fields.

For detach, again, ifindex and INGRESS|EGRESS are sufficient, but if the
user wants to provide more detailed parameters, we should do that
through extensible opts. That way we can keep growing this easily,
plus simple cases will remain simple.
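
To make the in/out idea concrete, here is a small stand-alone sketch. All
names (tc_opts_sketch, tc_attach_sketch) are hypothetical, the direction
argument is left out for brevity, no netlink is involved, and the values
filled in for an unset handle/priority are placeholders rather than what
the kernel would actually pick.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opts struct: a field left at 0 means "pick one for me". */
struct tc_opts_sketch {
	size_t sz;          /* size of this struct, for extensibility */
	uint32_t handle;    /* in: requested handle, out: handle in use */
	uint16_t priority;  /* in: requested priority, out: priority in use */
};

/* Stand-in for the attach path; it only models the parameter flow. */
static int tc_attach_sketch(int ifindex, struct tc_opts_sketch *opts)
{
	uint32_t handle = 0;
	uint16_t priority = 0;

	if (ifindex <= 0)
		return -EINVAL;
	if (opts) {
		handle = opts->handle;
		priority = opts->priority;
	}
	/* Placeholder values standing in for whatever the kernel assigns. */
	if (!handle)
		handle = 1;
	if (!priority)
		priority = 1;
	/* Report the effective values back through the same fields. */
	if (opts) {
		opts->handle = handle;
		opts->priority = priority;
	}
	return 0;
}
```

The simple case passes NULL (or zeroed opts) and ignores the result; a
caller that later wants to detach exactly this filter reads handle and
priority back from the same struct.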

Similarly bpf_tc_info below, there is no need to have struct
bpf_tc_attach_id id; field, just have handle and priority right there.
And bpf_tc_info should use OPTS framework for extensibility (though
opts name doesn't fit it very well, but it is still nice for
extensibility and for doing optional input/output params).

Does this make sense? Am I missing something crucial here?

I would probably keep the handle + priority in there; maybe if both are 0,
we could fix it to some default value internally, but without those it might
be a bit hard if people want to build a 'pipeline' of cls_bpf progs if they
need/want to.

Potentially, one could fix the handle itself and then allow specifying
different priorities for it, such that when a BPF prog returns TC_ACT_UNSPEC,
the next one inside that cls_bpf instance is executed; every other TC_ACT_*
opcode terminates the processing. Technically, only priority would really
be needed (unless you combine multiple different classifiers from tc side on
the ingress/egress hook, which is not great to begin with, tbh).
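
The pipeline semantics described above can be modelled in plain user-space
C. This is only a simulation to show the control flow, not kernel code:
filter_sketch, run_pipeline() and the three sample progs are invented for
the example, and the filters are assumed to be pre-sorted so that the
lowest priority value runs first. The TC_ACT_* values come from
linux/pkt_cls.h.

```c
#include <stddef.h>
#include <linux/pkt_cls.h>   /* TC_ACT_UNSPEC, TC_ACT_OK, TC_ACT_SHOT */

typedef int (*prog_fn)(const void *pkt);

struct filter_sketch {
	unsigned short priority;  /* lower value runs first */
	prog_fn prog;
};

/* Sample "programs": one defers, one drops, one accepts. */
static int prog_pass_on(const void *pkt) { (void)pkt; return TC_ACT_UNSPEC; }
static int prog_drop(const void *pkt)    { (void)pkt; return TC_ACT_SHOT; }
static int prog_ok(const void *pkt)      { (void)pkt; return TC_ACT_OK; }

/* Run filters that are assumed to be sorted by ascending priority. */
static int run_pipeline(const struct filter_sketch *f, size_t n,
			const void *pkt)
{
	for (size_t i = 0; i < n; i++) {
		int ret = f[i].prog(pkt);

		if (ret != TC_ACT_UNSPEC)
			return ret;  /* any other verdict ends processing */
	}
	return TC_ACT_UNSPEC;  /* no prog in the chain made a decision */
}
```

For a chain { 1: prog_pass_on, 2: prog_drop, 3: prog_ok }, the first prog
defers, the second returns TC_ACT_SHOT, and the third is never run.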

The general rule with any new structs added to libbpf APIs is to
either be 100% (ok, 99.99%) sure that they will never be changed, or
do forward/backward compatible OPTS. Any other thing is pain and calls
for symbol versioning, which we are trying really hard to avoid.
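
For readers not familiar with the pattern, a self-contained sketch of how a
size-prefixed opts struct stays compatible in both directions. libbpf has
its own internal helpers for this; the struct, macro and functions below
are hypothetical stand-ins written for this mail, and the field layout is
invented.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opts struct as the library currently knows it. */
struct opts_sketch {
	size_t sz;          /* caller sets this to sizeof() of ITS struct version */
	uint32_t handle;
	uint32_t priority;
	uint64_t flags;     /* pretend this field was added in a later release */
};

/* Does the caller's (possibly older, smaller) struct include this field? */
#define OPTS_SKETCH_HAS(opts, field) \
	((opts) != NULL && (opts)->sz >= \
	 offsetof(struct opts_sketch, field) + sizeof((opts)->field))

/* Read a field, falling back to a default when the caller's struct lacks it. */
static uint64_t opts_sketch_flags(const struct opts_sketch *opts)
{
	return OPTS_SKETCH_HAS(opts, flags) ? opts->flags : 0;
}

/* Accept NULL, older (smaller) structs, and newer (larger) structs whose
 * unknown tail is all zeroes; reject anything else. */
static bool opts_sketch_valid(const struct opts_sketch *opts)
{
	const unsigned char *p = (const unsigned char *)opts;

	if (!opts)
		return true;
	if (opts->sz < sizeof(opts->sz))
		return false;
	for (size_t i = sizeof(*opts); i < opts->sz; i++)
		if (p[i])
			return false;
	return true;
}
```

An old binary passes a smaller sz and the library treats the missing
fields as unset; a new binary running against an old library is accepted
only if the fields the library does not know about are zero, so nothing
the caller asked for is silently ignored.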

+     __u32 handle;
+     __u16 priority;
+};
+
+struct bpf_tc_info {
+     struct bpf_tc_attach_id id;
+     __u16 protocol;
+     __u32 chain_index;
+     __u32 prog_id;
+     __u8 tag[BPF_TAG_SIZE];
+     __u32 class_id;
+     __u32 bpf_flags;
+     __u32 bpf_flags_gen;

Given we do not yet have any setters e.g. for offload, etc, the one thing
I'd see useful and crucial initially is prog_id.

The protocol and chain_index fields, and I would also include tag, should be
dropped. Similarly class_id, given my earlier statement; the flags I would add
once this lib API supports offloading progs.

+};
+
+/* id is out parameter that will be written to, it must not be NULL */
+LIBBPF_API int bpf_tc_attach(int fd, __u32 ifindex, __u32 parent_id,
+                          const struct bpf_tc_opts *opts,
+                          struct bpf_tc_attach_id *id);
+LIBBPF_API int bpf_tc_detach(__u32 ifindex, __u32 parent_id,
+                          const struct bpf_tc_attach_id *id);
+LIBBPF_API int bpf_tc_get_info(__u32 ifindex, __u32 parent_id,
+                            const struct bpf_tc_attach_id *id,
+                            struct bpf_tc_info *info);

As per above, I'd replace parent_id with the dir enum.

+
   #ifdef __cplusplus
   } /* extern "C" */
   #endif



