This series converts filesystem capabilities from passing around raw
xattr data to using a kernel-internal representation with type safe
uids, similar to the conversion done previously for posix ACLs.
Currently fscaps representations in the kernel have two different
instances of unclear or confused types:

- fscaps are generally passed around in the raw xattr form, with the
  rootid sometimes containing the user uid value and at other times
  containing the filesystem value.
- The existing kernel-internal representation of fscaps,
  cpu_vfs_cap_data, uses the kuid_t type, but the value stored is
  actually a vfsuid.

This series eliminates this confusion by converting the xattr data to
the kernel representation near the userspace and filesystem boundaries,
using the kernel representation within the vfs and commoncap code. The
internal representation is renamed to vfs_caps to reflect this broader
use, and the rootid is changed to a vfsuid_t to correctly identify the
type of uid which it contains.

New vfs interfaces are added to allow for getting and setting fscaps
using the kernel representation. This requires the addition of new inode
operations to allow overlayfs to handle fscaps properly; all other
filesystems fall back to a generic implementation. The top-level vfs
xattr interfaces will now reject fscaps xattrs, though the lower-level
interfaces continue to accept them for reading and writing the raw xattr
data.

The existing xattr security hooks can continue to be used for fscaps.
There is some awkwardness here, as EVM requires the on-disk fscaps data
to compare with any existing on-disk value. Security checks need to
happen before calling into filesystem inode operations, when the fscaps
are still in the kernel-internal format, so an extra conversion to the
on-disk format is necessary for EVM's setxattr checks.

The remainder of the changes are preparatory work and addition of
helpers for converting between the xattr and kernel fscaps
representation.
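
As an illustration of the idea described above, the following is a
minimal userspace sketch, not code from this series. Every name in it
(the sketch_* types, the toy offset-based idmapping, the assumed 24-byte
xattr layout) is hypothetical and only loosely modeled on the kernel
concepts. It shows the raw xattr bytes being converted once, at the
boundary, into an internal structure whose rootid has its own wrapper
type, so that it cannot be silently mixed with a kuid-like value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; none of these names are kernel APIs. */
typedef struct { uint32_t val; } sketch_kuid_t;   /* uid in the kernel's own id space */
typedef struct { uint32_t val; } sketch_vfsuid_t; /* uid as seen through a mount idmapping */

#define SKETCH_INVALID_ID UINT32_MAX
/* Assumed raw layout: magic, 2 x (permitted, inheritable), rootid; all LE32. */
#define SKETCH_XATTR_SIZE 24

/* Toy idmapping: filesystem ids [fs_base, fs_base + count) map onto
 * [mnt_base, mnt_base + count). Real idmappings are more general. */
struct sketch_idmap { uint32_t fs_base, mnt_base, count; };

/* Internal representation: rootid carries its own type. */
struct sketch_vfs_caps {
    uint32_t magic_etc;
    sketch_vfsuid_t rootid;
    uint64_t permitted;
    uint64_t inheritable;
};

/* Decode a little-endian 32-bit value regardless of host endianness. */
static uint32_t sketch_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Map a filesystem uid through the toy idmapping; invalid if unmapped. */
static sketch_vfsuid_t sketch_make_vfsuid(const struct sketch_idmap *map,
                                          uint32_t fs_uid)
{
    sketch_vfsuid_t v = { SKETCH_INVALID_ID };

    if (fs_uid >= map->fs_base && fs_uid - map->fs_base < map->count)
        v.val = map->mnt_base + (fs_uid - map->fs_base);
    return v;
}

/* The only sanctioned way to compare the two uid types in this sketch. */
static int sketch_vfsuid_eq_kuid(sketch_vfsuid_t v, sketch_kuid_t k)
{
    return v.val != SKETCH_INVALID_ID && v.val == k.val;
}

/* Convert raw xattr bytes to the internal form at the boundary.
 * Returns 0 on success, -1 on a bad length or an unmapped rootid.
 * The magic/revision field is copied but not validated here. */
static int sketch_fscaps_from_xattr(const uint8_t *buf, size_t len,
                                    const struct sketch_idmap *map,
                                    struct sketch_vfs_caps *out)
{
    sketch_vfsuid_t rootid;

    assert(buf && map && out);
    if (len != SKETCH_XATTR_SIZE)
        return -1;

    rootid = sketch_make_vfsuid(map, sketch_le32(buf + 20));
    if (rootid.val == SKETCH_INVALID_ID)
        return -1;

    out->magic_etc = sketch_le32(buf);
    out->permitted = (uint64_t)sketch_le32(buf + 4) |
                     (uint64_t)sketch_le32(buf + 12) << 32;
    out->inheritable = (uint64_t)sketch_le32(buf + 8) |
                       (uint64_t)sketch_le32(buf + 16) << 32;
    out->rootid = rootid;
    return 0;
}
```

Because sketch_kuid_t and sketch_vfsuid_t are distinct struct types,
assigning one to the other, or passing a sketch_kuid_t where a
sketch_vfsuid_t is expected, fails to compile. That is the kind of
confusion the typed rootid is meant to rule out.
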

I have tested this code with xfstests, ltp, libcap2, and libcap-ng with
no regressions found.

Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@xxxxxxxxxx>
---
Seth Forshee (DigitalOcean) (16):
      mnt_idmapping: split out core vfs[ug]id_t definitions into vfsid.h
      mnt_idmapping: include cred.h
      capability: rename cpu_vfs_cap_data to vfs_caps
      capability: use vfsuid_t for vfs_caps rootids
      capability: provide helpers for converting between xattrs and vfs_caps
      capability: provide a helper for converting vfs_caps to xattr for userspace
      fs: add inode operations to get/set/remove fscaps
      fs: add vfs_get_fscaps()
      fs: add vfs_set_fscaps()
      fs: add vfs_remove_fscaps()
      ovl: add fscaps handlers
      ovl: use vfs_{get,set}_fscaps() for copy-up
      fs: use vfs interfaces for capabilities xattrs
      commoncap: remove cap_inode_getsecurity()
      commoncap: use vfs fscaps interfaces for killpriv checks
      vfs: return -EOPNOTSUPP for fscaps from vfs_*xattr()

 MAINTAINERS                   |   1 +
 fs/overlayfs/copy_up.c        |  72 +++---
 fs/overlayfs/dir.c            |   3 +
 fs/overlayfs/inode.c          |  84 +++++++
 fs/overlayfs/overlayfs.h      |   6 +
 fs/xattr.c                    | 286 ++++++++++++++++++++++-
 include/linux/capability.h    |  23 +-
 include/linux/fs.h            |  13 ++
 include/linux/mnt_idmapping.h |  67 +-----
 include/linux/security.h      |   5 +-
 include/linux/vfsid.h         |  74 ++++++
 kernel/auditsc.c              |   9 +-
 security/commoncap.c          | 519 ++++++++++++++++++++++--------------------
 13 files changed, 802 insertions(+), 360 deletions(-)
---
base-commit: 2cc14f52aeb78ce3f29677c2de1f06c0e91471ab
change-id: 20230512-idmap-fscap-refactor-63b61fa0a36f

Best regards,
-- 
Seth Forshee (DigitalOcean) <sforshee@xxxxxxxxxx>