Re: [PATCH v2 bpf-next 00/18] BPF token

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 6/16/23 12:48 AM, Andrii Nakryiko wrote:
On Wed, Jun 14, 2023 at 2:39 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
On Wed, Jun 14, 2023 at 02:23:02AM +0200, Djalal Harouni wrote:
On Tue, Jun 13, 2023 at 12:27 AM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:
On Mon, Jun 12, 2023 at 5:02 AM Djalal Harouni <tixxdz@xxxxxxxxx> wrote:
On Sat, Jun 10, 2023 at 12:57 AM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:
On Fri, Jun 9, 2023 at 3:30 PM Djalal Harouni <tixxdz@xxxxxxxxx> wrote:

Hi Andrii,

On Thu, Jun 8, 2023 at 1:54 AM Andrii Nakryiko <andrii@xxxxxxxxxx> wrote:

...
creating new BPF objects like BPF programs, BPF maps, etc.

Is there a reason for coupling this only with the userns?

There is no coupling. Without userns it is at least possible to grant
CAP_BPF and other capabilities from init ns. With user namespace that
becomes impossible.

But these are not the same: delegate full cap vs delegate an fd mask?

What FD mask are we talking about here? I don't recall us talking
about any FD masks, so this one is a bit confusing without more
context.

Ah err, sorry yes referring to fd token (which I assumed is a mask of
allowed operations or something like that).

So I want the possibility to delegate the fd token in the init userns.


One can argue unprivileged in init userns is the same privileged in
nested userns
Getting to delegate fd in init userns, then in nested ones seems logical...

Again, sorry, I'm not following. Can you please elaborate what you mean?

I mean can we use the fd token in the init user namespace too? not
only in the nested user namespaces but in the first one? Sorry I
didn't check the code.


[...]


Having the fd or "token" that gives access rights pinned in two
separate bpffs mounts seems too much, it crosses namespaces (mount,
userns etc), environments setup by privileged...

See above, there is nothing namespaceable about BPF itself, and BPF
token as well. If some production setup benefits from pinning one BPF
token in multiple places, I don't see the problem with that.


I would just make it per bpffs mount and that's it, nothing more. If a
program wants to bind mount it somewhere else then it's not a bpf
problem.

And if some application wants to pin BPF token, why would that be BPF
subsystem's problem as well?

The credentials, capabilities, keyring, different namespaces, etc are
all attached to the owning user namespace, if the BPF subsystem goes
its own way and creates a token to split up CAP_BPF without following
that model, then it's definitely a BPF subsystem problem...  I don't
recommend that.

Feels it's going more of a system-wide approach opening BPF
functionality where ultimately it clashes with the argument: delegate
a subset of BPF functionality to a *trusted* unprivileged application.
My reading of delegation is within a container/service hierarchy
nothing more.

You're making the exact arguments that Lennart, Aleksa, and I have been
making in the LSFMM presentation about this topic. It's even recorded:

Alright, so (I think) I get a pretty good feel now for what the main
concerns are, and why people are trying to push this to be an FS. And
it's not so much that BPF token grants bpf() syscall usage to unpriv
(but trusted) workloads or that BPF itself is not namespaceable. The
main worry is that BPF token, once issues, could be
illegally/uncontrollably passed outside of container, intentionally or
not. And by having this association with mount namespace (through BPF
FS) we automatically limit the sharing to only contain that has access
to that BPF FS.

+1

So I agree that it makes sense to have this mount namespace
association, but I also would like to keep BPF token to be a separate
entity from BPF FS itself, and have the ability to have multiple
different BPF tokens exposed in a single BPF FS instance. I think the
latter is important.

So how about this slight modification: when a BPF token is created
using BPF_TOKEN_CREATE command, the user has to provide an FD for
"associated" BPF FS instance (superblock). What that does is allows
BPF token to be created with BPF FS and/or mount namespace association
set in stone. After that BPF token can only be pinned in that BPF FS
instance and cannot leave the boundaries of that mount namespace
(specific details to be worked out, this is new area for me, so I'm
sorry if I'm missing nuances).

Given bpffs is not a singleton and there can be multiple bpffs instances
in a container, couldn't we make the token a special bpffs mount/mode?
Something like single .token file in that mount (for example) which can
be opened and the fd then passed along for prog/map creation? And given
the multiple mounts, this also allows potentially for multiple tokens?
In other words, this is already set up by the container manager when it
sets up mounts rather than later, and the regular bpffs instance is sth
separate from all that. Meaning, in your container you get the usual
bpffs instance and then one or more special bpffs instances as tokens
at different paths (and in future they could unlock different subset of
bpf functionality for example).

Thanks,
Daniel




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux