On 25.10.23 15:17, Paul Moore wrote:
> On Wed, Oct 25, 2023 at 5:42 AM Michael Weiß
> <michael.weiss@xxxxxxxxxxxxxxxxxxx> wrote:
>>
>> Introduce the flag BPF_DEVCG_ACC_MKNOD_UNS for bpf programs of type
>> BPF_PROG_TYPE_CGROUP_DEVICE which allows to guard access to mknod
>> in non-initial user namespaces.
>>
>> If a container manager restricts its unprivileged (user namespaced)
>> children by a device cgroup, it is not necessary to deny mknod()
>> anymore. Thus, user space applications may map devices on different
>> locations in the file system by using mknod() inside the container.
>>
>> A use case for this, we also use in GyroidOS, is to run virsh for
>> VMs inside an unprivileged container. virsh creates device nodes,
>> e.g., "/var/run/libvirt/qemu/11-fgfg.dev/null" which currently fails
>> in a non-initial userns, even if a cgroup device white list with the
>> corresponding major, minor of /dev/null exists. Thus, in this case
>> the usual bind mounts or pre populated device nodes under /dev are
>> not sufficient.
>>
>> To circumvent this limitation, allow mknod() by checking CAP_MKNOD
>> in the userns by implementing the security_inode_mknod_nscap(). The
>> hook implementation checks if the corresponding permission flag
>> BPF_DEVCG_ACC_MKNOD_UNS is set for the device in the bpf program.
>> To avoid to create unusable inodes in user space the hook also
>> checks SB_I_NODEV on the corresponding super block.
>>
>> Further, the security_sb_alloc_userns() hook is implemented using
>> cgroup_bpf_current_enabled() to allow usage of device nodes on super
>> blocks mounted by a guarded task.
>>
>> Patch 1 to 3 rework the current devcgroup_inode hooks as an LSM
>>
>> Patch 4 to 8 rework explicit calls to devcgroup_check_permission
>> also as LSM hooks and finalize the conversion of the device_cgroup
>> subsystem to a LSM.
>>
>> Patch 9 and 10 introduce new generic security hooks to be used
>> for the actual mknod device guard implementation.
>>
>> Patch 11 wires up the security hooks in the vfs
>>
>> Patch 12 and 13 provide helper functions in the bpf cgroup
>> subsystem.
>>
>> Patch 14 finally implement the LSM hooks to grand access
>>
>> Signed-off-by: Michael Weiß <michael.weiss@xxxxxxxxxxxxxxxxxxx>
>> ---
>> Changes in v2:
>> - Integrate this as LSM (Christian, Paul)
>> - Switched to a device cgroup specific flag instead of a generic
>>   bpf program flag (Christian)
>> - do not ignore SB_I_NODEV in fs/namei.c but use LSM hook in
>>   sb_alloc_super in fs/super.c
>> - Link to v1: https://lore.kernel.org/r/20230814-devcg_guard-v1-0-654971ab88b1@xxxxxxxxxxxxxxxxxxx
>>
>> Michael Weiß (14):
>>   device_cgroup: Implement devcgroup hooks as lsm security hooks
>>   vfs: Remove explicit devcgroup_inode calls
>>   device_cgroup: Remove explicit devcgroup_inode hooks
>>   lsm: Add security_dev_permission() hook
>>   device_cgroup: Implement dev_permission() hook
>>   block: Switch from devcgroup_check_permission to security hook
>>   drm/amdkfd: Switch from devcgroup_check_permission to security hook
>>   device_cgroup: Hide devcgroup functionality completely in lsm
>>   lsm: Add security_inode_mknod_nscap() hook
>>   lsm: Add security_sb_alloc_userns() hook
>>   vfs: Wire up security hooks for lsm-based device guard in userns
>>   bpf: Add flag BPF_DEVCG_ACC_MKNOD_UNS for device access
>>   bpf: cgroup: Introduce helper cgroup_bpf_current_enabled()
>>   device_cgroup: Allow mknod in non-initial userns if guarded
>>
>>  block/bdev.c                                 |   9 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h        |   7 +-
>>  fs/namei.c                                   |  24 ++--
>>  fs/super.c                                   |   6 +-
>>  include/linux/bpf-cgroup.h                   |   2 +
>>  include/linux/device_cgroup.h                |  67 -----------
>>  include/linux/lsm_hook_defs.h                |   4 +
>>  include/linux/security.h                     |  18 +++
>>  include/uapi/linux/bpf.h                     |   1 +
>>  init/Kconfig                                 |   4 +
>>  kernel/bpf/cgroup.c                          |  14 +++
>>  security/Kconfig                             |   1 +
>>  security/Makefile                            |   2 +-
>>  security/device_cgroup/Kconfig               |   7 ++
>>  security/device_cgroup/Makefile              |   4 +
>>  security/{ => device_cgroup}/device_cgroup.c |   3 +-
>>  security/device_cgroup/device_cgroup.h       |  20 ++++
>>  security/device_cgroup/lsm.c                 | 114 +++++++++++++++++++
>>  security/security.c                          |  75 ++++++++++++
>>  19 files changed, 294 insertions(+), 88 deletions(-)
>>  delete mode 100644 include/linux/device_cgroup.h
>>  create mode 100644 security/device_cgroup/Kconfig
>>  create mode 100644 security/device_cgroup/Makefile
>>  rename security/{ => device_cgroup}/device_cgroup.c (99%)
>>  create mode 100644 security/device_cgroup/device_cgroup.h
>>  create mode 100644 security/device_cgroup/lsm.c
>
> Hi Michael,
>
> I think this was lost because it wasn't CC'd to the LSM list (see
> below). I've CC'd the list on my reply, but future patch submissions
> that involve the LSM must be posted to the LSM list if you would like
> them to be considered.
>
> http://vger.kernel.org/vger-lists.html#linux-security-module
>

Hi Paul,

thanks, I'll keep this in mind for the next submissions.

I have also resent it because I realized that some spam filters may
have swallowed the last submission, as I used my private SMTP server
from another domain in the gitconfig. Sorry for that. I hope everyone
has received it now.

Thanks,
Michael