Re: [PATCH v2 bpf-next 00/18] BPF token

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Thu, 22 Jun 2023 16:35:02 -0700

On Thu, Jun 22, 2023 at 2:04 PM Maryam Tahhan <mtahhan@xxxxxxxxxx> wrote:
>
> On Thu, Jun 22, 2023 at 7:40 PM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> >
> > On Thu, Jun 22, 2023 at 10:38 AM Maryam Tahhan <mtahhan@xxxxxxxxxx> wrote:
> > >
> >
> > Please avoid replying in HTML.
> >
>
> Sorry.

No worries, the problem is that the mailing list filters out such
messages. So if you go to [0] and scroll to the bottom of the page,
you'll see that your email is not in the lore archive. People not
CC'ed directly will only see what you wrote through my reply quoting
your email.

  [0] https://lore.kernel.org/bpf/CAFdtZitYhOK4TzAJVbFPMfup_homxSSu3Q8zjJCCiHCf22eJvQ@xxxxxxxxxxxxxx/#t

>
> [...]
>
> >
> > Disclaimer: I don't know anything about Kubernetes, so don't expect me
> > to reply with correct terminology or a detailed understanding of the
> > configuration of containers.
> >
> > But on a more generic and conceptual level, it seems like you are
> > making some implementation assumptions and arguing based on that.
> >
>
> Firstly, thank you for taking the time to respond and explain. I can see
> where you are coming from.
>
> Yeah, admittedly I did make a few assumptions. I was thrown by the reference
> to `unprivileged` processes in the cover letter. It seems like this is a way to
> grant namespaced BPF permissions to a process (my gross
> oversimplification - sorry).

Yep, with the caveat that BPF functionality itself cannot be
namespaced (i.e., contained within the container), so this has to be
granted by a fully privileged process/proxy based on trusting the
workload to not do anything harmful.

> Looking back throughout your responses there's nothing unprivileged here.
>
> [...]
>
>
> > Hopefully you can see where I'm going with this. And this is just one
> > random tiny example. We can think up tons of other cases to prove BPF
> > is not isolatable to any sort of "container".
> >
> > >
> > > Anyway - I hope this clarifies my original intent - which is that a proxy
> > > at least starts to solve one part of the puzzle. Whatever approach(es) we
> > > take to solve the rest of these problems, the more we can stick to tried
> > > and trusted mechanisms the better.
> >
> > I disagree. BPF proxy complicates logistics, operations, and developer
> > experience, without resolving the issue of determining trust and the
> > need to delegate or proxy BPF functionality.
>
> I appreciate your viewpoint. I just don't think that this is a
> one-solution-fits-every-scenario situation.

Absolutely. It's also not my intent or goal to kill any sort of BPF
proxy. What I'm trying to convey is that the BPF proxy approach has
severe downsides, depending on application, deployment practices, etc,
etc. It's not always a (good) answer. So I just want to avoid having
the dichotomy of "BPF token or BPF proxy, there could be only one".

> For example in the case of AF_XDP, I'd like to be able to run my
> containers without any additional privileges. I've been working on a device
> plugin for Kubernetes whose job is to provision netdevs with an XDP redirect
> program (then later there's a CNI that moves the netdev into the pod network
> namespace). Originally I was using bpf locally in the device plugin (to load
> the bpf program and get the XSK map fd) and SCM rights to pass the XSK_MAP
> over UDS but honestly it was relatively cumbersome from an app development
> POV, very easy to get wrong, and trying to keep up with the latest bpf api
> changes started to become an issue. If I wanted to add more interesting bpf
> programs I had to do a full recompile...
>
> I've now moved to using bpfd for the loading and unloading of the bpf
> program on my behalf. It also comes with a bunch of other advantages,
> including being able to update my trusted bpf program transparently to both
> the device plugin and my application (I don't have to respin this either
> when I write/want to add a new bpf prog), but mainly I have a trusted
> proxy managing bpffs, bpf progs and maps for me. There's still more
> work to do here...
>
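
For readers unfamiliar with the hand-off described in the quoted text
(passing a file descriptor over a Unix domain socket with SCM_RIGHTS), here
is a minimal sketch of the mechanism. It is not the device plugin's actual
code: a plain pipe stands in for the XSK map fd, because creating a real BPF
map needs privileges, and the names send_fd, recv_fd and demo are
hypothetical helpers made up for this illustration.

```python
import array
import os
import socket


def send_fd(sock, fd):
    """Send one file descriptor over a Unix domain socket via SCM_RIGHTS."""
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [fd]))]
    # On Linux stream sockets, ancillary data has to travel with at least
    # one byte of regular payload, hence the dummy b"F".
    sock.sendmsg([b"F"], ancillary)


def recv_fd(sock):
    """Receive one file descriptor sent with send_fd()."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Drop any trailing bytes that do not form a whole int.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
    if not fds:
        raise RuntimeError("no file descriptor received")
    return fds[0]


def demo():
    """Pass the write end of a pipe across a socketpair and use the copy."""
    # Assumption: a socketpair stands in for the plugin <-> app UDS connection.
    plugin_sock, app_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # Assumption: the pipe's write end stands in for the XSK map fd.
    read_end, write_end = os.pipe()
    try:
        send_fd(plugin_sock, write_end)
        received = recv_fd(app_sock)
        try:
            # The received fd is a new descriptor for the same open file.
            os.write(received, b"hello")
            return os.read(read_end, 5)
        finally:
            os.close(received)
    finally:
        os.close(read_end)
        os.close(write_end)
        plugin_sock.close()
        app_sock.close()
```

Even in this toy form, both sides have to agree on the payload framing and
on who closes which descriptor, which gives some idea of why the approach
was described as easy to get wrong.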

It's a spectrum, and from my observations networking BPF programs lend
themselves more easily to this model of BPF proxy (at least until they
become complicated ensembles of networking and tracing BPF programs).
Very often networking applications can indeed load BPF programs
completely independently from user-space parts, keep them "persisted"
in kernel, occasionally control them through pinned BPF maps, etc.

But the further you go towards tracing applications where BPF parts
are an integral part of the overall user-space application, the less well
this model works. It's much simpler to have BPF parts embedded,
loaded, versioned, initialized and interacted with from inside the
same process. And we have lots of such applications. BPF proxy
approach is a massive complication for such use cases with a bunch of
downsides.

> I understand this is a much simplified scenario, and I'm sure I can
> think of several more where a proxy is useful. All I'm trying to say is,
> I'm not sure there's just a one size fits all soln for these issues.

100% agree. BPF token won't fit all use cases. And BPF proxy won't fit
all use cases either. Both approaches can and should coexist.

>
> Thanks
> Maryam
>