Re: [PATCH RFC bpf-next 0/7] Add bpf_link based TC-BPF API

On Wed, Jun 16, 2021 at 08:10:55PM IST, Jamal Hadi Salim wrote:
> On 2021-06-15 7:07 p.m., Daniel Borkmann wrote:
> > On 6/13/21 11:10 PM, Jamal Hadi Salim wrote:
>
> [..]
>
> > >
> > > I look at it from the perspective that if i can run something with
> > > existing tc loading mechanism then i should be able to do the same
> > > with the new (libbpf) scheme.
> >
> > The intention is not to provide a full-blown tc library (that could be
> > subject to a libtc or such), but rather to only have libbpf abstract
> > the tc related API that is most /relevant/ for BPF program development
> > and /efficient/ in terms of execution in the fast-path, while at the
> > same time providing a good user experience from the API itself.
> >
> > That is, simple to use and straightforward to explain to folks with
> > otherwise zero experience of tc. The current implementation does all
> > that, and from experience with large BPF programs managed via cls_bpf,
> > that is all that is actually needed from the tc layer perspective. The
> > ability to have multiple programs (incl. priorities) is in the
> > existing libbpf API as well.
> >
> >
>
> Which is a fair statement, but if you take away things that work fine
> with the current iproute2 loading, I have no motivation to migrate at
> all. It's like the saying about "throwing out the baby with the
> bathwater". I want my baby.
>
> In particular, here's a list from Kartikeya's implementation:
>
> 1) Direct action mode only
> 2) Protocol ETH_P_ALL only
> 3) Only at chain 0
> 4) No block support
>

Block is supported; you just need to set TCM_IFINDEX_MAGIC_BLOCK as the
ifindex and the block index as the parent. There isn't anything more to it
than that from the libbpf side (just specify the BPF_TC_CUSTOM enum).

What I meant was that hook_create doesn't support specifying the ingress/egress
block when creating the clsact qdisc, but that typically isn't a problem because
the qdiscs for shared blocks would be set up together prior to attachment anyway.

> I think he said priority is supported, but it was also originally on
> that list. When we discussed at the meetup, it didn't seem these cost
> anything in terms of code complexity or usability of the API.
>
> 1) We use non-DA mode, so I can't live without that (and frankly eBPF
> has challenges adding complex code blocks).
>
> 2) We also use different protocols when I need to (yes, you can do the
> filtering in the BPF code - but why impose that if the cost of adding
> it is small? And of course it is cheaper to do the check outside of
> eBPF).
>
> 3) We use chains other than zero.
>
> 4) So far we don't use block support, but certainly my recent
> experience in a deployment shows that we need to group netdevices more
> often than I thought was necessary. So if I could express one map
> shared by multiple netdevices, it should cut down the user space
> complexity.
>
> cheers,
> jamal

--
Kartikeya


