Re: [PATCH bpf-next 01/10] bpf: Add initial fd-based API to attach tc BPF programs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/5/22 2:55 AM, sdf@xxxxxxxxxx wrote:
On 10/05, Daniel Borkmann wrote:
[...]

The series looks exciting, haven't had a chance to look deeply, will try
to find some time this week.

Great, thanks!

We've chatted briefly about priority during the talk, let's maybe discuss
it here more?

I, as a user, still really have no clue about what priority to use.
We have this problem at tc, and we'll seemingly have the same problem
here? I guess it's even more relevant in k8s because internally at G we
can control the users.

Is it worth at least trying to provide some default bands / guidance?

For example, having SEC('tc/ingress') receive attach_priority=124 by
default? Maybe we can even have something like 'tc/ingress_first' get
attach_priority=1 and 'tc/ingress_last' with attach_priority=254?
(the names are arbitrary, we can do something better)

ingress_first/ingress_last can be used by some monitoring jobs. The rest
can use default 124. If somebody really needs a custom priority, then they
can manually use something around 124/2 if they need to trigger before the
'default' priority or 124+124/2 if they want to trigger after?

Thoughts? Is it worth it? Do we care?

I think guidance is needed, yes, I can add a few paragraphs to the libbpf
header file where we also have the tc BPF link API. I had a brief discussion
around this also with developers from datadog as they also use the infra
via tc BPF. Overall, its a hard problem, and I don't think there's a good
generic solution. The '*_last' is implied by prio=0, so that kernel auto-
allocates it, and for libbpf we could add an API for it where the user
does not need to specify prio specifically. The 'appending' is reasonable
to me given if an application explicitly requests to be added as first
(and e.g. enforces policy through tc BPF), but some other 3rd party application
prepends itself as first, it can bypass the former, which would be too easy
to shoot yourself in the foot. Overall the issue in tc land is that ordering
matters, skb packet data could be mangled (e.g. IPs NATed), skb fields can
be mangled, and we can have redirect actions (dev A vs. B); the only way I'd
see were this is possible if somewhat verifier can annotate the prog when
it didn't observe any writes to skb, and no redirect was in play. Then you've
kind of replicated the constraints similar to tracing where the attachment
can say that ordering doesn't matter if all the progs are in same style.
Otherwise, explicit corporation is needed as is today with rest of tc (or
as Toke did in libxdp) with multi-attach. In the specific case I mentioned
at LPC, it can be made to work given one of the two is only observing traffic
at the layer, e.g. it could get prepended if there is guarantee that all
return codes are tc_act_unspec so that there is no bypass and then you'll
see all traffic or appended to see only traffic which made it past the
policy. So it all depends on the applications installing programs, but to
solve it generically is not possible given ordering and conflicting actions.
So, imho, an _append() API for libbpf can be added along with guidance for
developers when to use _append() vs explicit prio.

Thanks,
Daniel

      };

      struct { /* anonymous struct used by BPF_PROG_TEST_RUN command */
@@ -1452,7 +1460,10 @@ union bpf_attr {
      } info;

      struct { /* anonymous struct used by BPF_PROG_QUERY command */
-        __u32        target_fd;    /* container object to query */
+        union {
+            __u32    target_fd;    /* container object to query */
+            __u32    target_ifindex; /* target ifindex */



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux