Re: [PATCH bpf-next 01/10] bpf: Add initial fd-based API to attach tc BPF programs

Jamal Hadi Salim <jhs@xxxxxxxxxxxx> · Thu, 6 Oct 2022 10:40:52 -0400

On Thu, Oct 6, 2022 at 1:01 AM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Wed, Oct 05, 2022 at 01:11:34AM +0200, Daniel Borkmann wrote:

>
> I cannot help but feel that prio logic copy-paste from old tc, netfilter and friends
> is done because "that's how things were done in the past".
> imo it was a well intentioned mistake and all networking things (tc, netfilter, etc)
> copy-pasted that cumbersome and hard to use concept.
> Let's throw away that baggage?
> In good set of cases the bpf prog inserter cares whether the prog is first or not.
> Since the first prog returning anything but TC_NEXT will be final.
> I think prog insertion flags: 'I want to run first' vs 'I don't care about order'
> is good enough in practice. Any complex scheme should probably be programmable
> as any policy should. For example in Meta we have 'xdp chainer' logic that is similar
> to libxdp chaining, but we added a feature that allows a prog to jump over another
> prog and continue the chain. Priority concept cannot express that.
> Since we'd have to add some "policy program" anyway for use cases like this
> let's keep things as simple as possible?
> Then maybe we can adopt this "as-simple-as-possible" to XDP hooks ?
> And allow bpf progs chaining in the kernel with "run_me_first" vs "run_me_anywhere"
> in both tcx and xdp ?

You just described the features already offered by tc opcodes + priority.
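
To make that concrete, here is a rough userspace sketch, not kernel code.
Only TC_NEXT comes from this thread; the other verdict values, struct
entry, run_chain() and the default verdict are hypothetical names and
assumptions for illustration. "Run me first" is just the lowest priority
value, and the first program returning anything but TC_NEXT ends the chain.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical verdict values; only the name TC_NEXT is from the thread. */
enum verdict { TC_NEXT = -1, V_PASS = 0, V_DROP = 2 };

typedef enum verdict (*prog_fn)(const void *pkt);

struct entry {
    unsigned int prio;   /* lower value runs earlier */
    prog_fn run;
};

/* Order entries by ascending priority. */
static int cmp_prio(const void *a, const void *b)
{
    const struct entry *x = a, *y = b;
    return (x->prio > y->prio) - (x->prio < y->prio);
}

/* Run programs in priority order; the first non-TC_NEXT verdict is final. */
static enum verdict run_chain(struct entry *chain, size_t n, const void *pkt)
{
    qsort(chain, n, sizeof(*chain), cmp_prio);
    for (size_t i = 0; i < n; i++) {
        enum verdict v = chain[i].run(pkt);
        if (v != TC_NEXT)
            return v;
    }
    return V_PASS;  /* assumption: default when every program defers */
}

/* Trivial stand-in programs for illustration. */
static enum verdict prog_next(const void *pkt) { (void)pkt; return TC_NEXT; }
static enum verdict prog_drop(const void *pkt) { (void)pkt; return V_DROP; }
static enum verdict prog_pass(const void *pkt) { (void)pkt; return V_PASS; }
```

With entries {10, prog_pass}, {1, prog_drop}, {5, prog_next}, the prio 1
program runs first and its verdict is final, so the others never run.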

This problem is solvable with a user space resource arbitration scheme.
Reading through the thread, a daemon of some sort will do. The best
option would be a daemon that issues tokens which can be validated in
the kernel (a Kerberos-style approach), i.e. fds alone don't resolve this.

cheers,
jamal