Re: [PATCH RFC 1/4] bpf: add cgroup device guard to flag a cgroup device prog

Michael Weiß <michael.weiss@xxxxxxxxxxxxxxxxxxx> · Thu, 17 Aug 2023 17:47:07 +0200

On 15.08.23 10:59, Christian Brauner wrote:
> On Mon, Aug 14, 2023 at 04:26:09PM +0200, Michael Weiß wrote:
>> Introduce the BPF_F_CGROUP_DEVICE_GUARD flag for BPF_PROG_LOAD,
>> which allows a cgroup device program to be marked as a device guard.
> 
> Currently we block access to devices unconditionally in may_open_dev().
> Anything that's mounted by an unprivileged container will get
> SB_I_NODEV set in s_i_flags.
> 
> Then we currently mediate device access in:
> 
> * inode_permission()
>   -> devcgroup_inode_permission()
> * vfs_mknod()
>   -> devcgroup_inode_mknod()
> * blkdev_get_by_dev() // sget()/sget_fc(), other ways to open block devices and friends
>   -> devcgroup_check_permission()
> * drivers/gpu/drm/amd/amdkfd // weird restrictions on showing gpu info afaict
>   -> devcgroup_check_permission()
> 
> All your new flag does is to bypass that SB_I_NODEV check afaict and let
> it proceed to the devcgroup_*() checks for the vfs layer.

Yes. In an early version, I had the check in super.c to avoid setting
SB_I_NODEV on mount. I then thought it would be less invasive to keep both
checks in one source file. From an architectural point of view, though, it
would be better to do it in super.c. Should we?
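
A minimal user-space sketch of the check under discussion, to make the
semantics concrete. It is not the patch itself: the model_* structs,
may_open_dev_guarded() and the has_device_guard parameter are hypothetical
names for illustration, the flag values are only illustrative, and it
assumes MNT_NODEV is left unaffected by the guard.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel flags; values are illustrative. */
#define MNT_NODEV  0x02u	/* mount was made with "nodev" */
#define SB_I_NODEV 0x04u	/* superblock ignores device nodes */

/* Hypothetical, stripped-down models of super_block and vfsmount. */
struct model_sb    { unsigned int s_iflags; };
struct model_mount { unsigned int mnt_flags; struct model_sb *sb; };

/* Current behaviour: either nodev flag blocks the open unconditionally. */
bool may_open_dev_current(const struct model_mount *m)
{
	return !(m->mnt_flags & MNT_NODEV) && !(m->sb->s_iflags & SB_I_NODEV);
}

/*
 * Behaviour as discussed: a device guard lets the open get past the
 * SB_I_NODEV check so that the devcgroup_*() hooks decide instead.
 * Assumption: MNT_NODEV still blocks the open.
 */
bool may_open_dev_guarded(const struct model_mount *m, bool has_device_guard)
{
	if (m->mnt_flags & MNT_NODEV)
		return false;
	if ((m->sb->s_iflags & SB_I_NODEV) && !has_device_guard)
		return false;
	return true;
}

/* Walk through the cases for a superblock mounted by a container. */
void demo_may_open_dev(void)
{
	struct model_sb sb = { .s_iflags = SB_I_NODEV };
	struct model_mount m = { .mnt_flags = 0, .sb = &sb };

	assert(!may_open_dev_current(&m));        /* blocked today */
	assert(!may_open_dev_guarded(&m, false)); /* no guard: still blocked */
	assert(may_open_dev_guarded(&m, true));   /* guard: falls through */
}
```

Doing the check in super.c instead would mean SB_I_NODEV is never set for
such a mount, so a function like may_open_dev_current() would not need to
change at all.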

> 
> But I don't get the semantics yet.
> Is that a flag which is set on BPF_PROG_TYPE_CGROUP_DEVICE programs or
> is that a flag on random bpf programs? It looks like it would be the
> latter but design-wise I would expect this to be a property of the
> device program itself.

Yes, it's a flag on the bpf program that can be set during BPF_PROG_LOAD.
This was straightforward to implement, similar to the BPF_F_XDP_* flags.
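
As a rough sketch of what that looks like from user space, the flag would
go into prog_flags of the BPF_PROG_LOAD attribute, the same field that
carries the BPF_F_XDP_* flags. BPF_F_CGROUP_DEVICE_GUARD is only proposed
by this RFC, so the bit value below is a placeholder and not the value from
the patch; init_guard_load_attr() and demo_guard_attr() are hypothetical
helpers.

```c
#include <assert.h>
#include <string.h>
#include <linux/bpf.h>

/*
 * Hypothetical: the flag is proposed by this RFC and is not in mainline
 * headers. The bit chosen here is a placeholder, not the patch's value.
 */
#ifndef BPF_F_CGROUP_DEVICE_GUARD
#define BPF_F_CGROUP_DEVICE_GUARD (1U << 31)
#endif

/* Fill a BPF_PROG_LOAD attribute for a device program flagged as guard. */
void init_guard_load_attr(union bpf_attr *attr,
			  const struct bpf_insn *insns, unsigned int insn_cnt)
{
	memset(attr, 0, sizeof(*attr));
	attr->prog_type = BPF_PROG_TYPE_CGROUP_DEVICE;
	attr->insns = (__u64)(unsigned long)insns;
	attr->insn_cnt = insn_cnt;
	attr->license = (__u64)(unsigned long)"GPL";
	attr->prog_flags = BPF_F_CGROUP_DEVICE_GUARD;
}

/* Build the attribute for a trivial "return 1" program; no syscall is made. */
void demo_guard_attr(void)
{
	struct bpf_insn prog[] = {
		{ .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = BPF_REG_0, .imm = 1 },
		{ .code = BPF_JMP | BPF_EXIT },
	};
	union bpf_attr attr;

	init_guard_load_attr(&attr, prog, 2);
	assert(attr.prog_type == BPF_PROG_TYPE_CGROUP_DEVICE);
	assert(attr.prog_flags & BPF_F_CGROUP_DEVICE_GUARD);
}
```

The attribute would then be passed to the kernel with
syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr)); a kernel without the
patch should reject the unknown prog_flags bit.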

Cheers,
Michael