On 15.08.23 10:59, Christian Brauner wrote:
> On Mon, Aug 14, 2023 at 04:26:09PM +0200, Michael Weiß wrote:
>> Introduce the BPF_F_CGROUP_DEVICE_GUARD flag for BPF_PROG_LOAD
>> which allows to set a cgroup device program to be a device guard.
>
> Currently we block access to devices unconditionally in may_open_dev().
> Anything that's mounted by an unprivileged containers will get
> SB_I_NODEV set in s_i_flags.
>
> Then we currently mediate device access in:
>
> * inode_permission()
>   -> devcgroup_inode_permission()
> * vfs_mknod()
>   -> devcgroup_inode_mknod()
> * blkdev_get_by_dev() // sget()/sget_fc(), other ways to open block devices and friends
>   -> devcgroup_check_permission()
> * drivers/gpu/drm/amd/amdkfd // weird restrictions on showing gpu info afaict
>   -> devcgroup_check_permission()
>
> All your new flag does is to bypass that SB_I_NODEV check afaict and let
> it proceed to the devcgroup_*() checks for the vfs layer.

Yes. In an early version, I had the check in super.c to avoid setting
SB_I_NODEV on mount in the first place. I then thought it would be a
less invasive change to do both checks in one source file. But from an
architecture point of view it would be better to do it in super.c.
Should we?

>
> But I don't get the semantics yet.
> Is that a flag which is set on BPF_PROG_TYPE_CGROUP_DEVICE programs or
> is that a flag on random bpf programs? It looks like it would be the
> latter but design-wise I would expect this to be a property of the
> device program itself.

Yes, it's a flag on the bpf program which can be set during
BPF_PROG_LOAD. This was straightforward to implement, similar to the
BPF_F_XDP_* flags.

Cheers,
Michael