This series converts filesystem capabilities from passing around raw
xattr data to using a kernel-internal representation with type safe
uids, similar to the conversion done previously for posix ACLs.
Currently fscaps representations in the kernel have two different
instances of unclear or confused types:

- fscaps are generally passed around in the raw xattr form, with the
  rootid sometimes containing the user uid value and at other times
  containing the filesystem value.
- The existing kernel-internal representation of fscaps,
  cpu_vfs_cap_data, uses the kuid_t type, but the value stored is
  actually a vfsuid.

This series eliminates this confusion by converting the xattr data to
the kernel representation near the userspace and filesystem boundaries,
using the kernel representation within the vfs and commoncap code. The
internal representation is renamed to vfs_caps to reflect this broader
use, and the rootid is changed to a vfsuid_t to correctly identify the
type of uid which it contains.

New vfs interfaces are added to allow for getting and setting fscaps
using the kernel representation. This requires the addition of new inode
operations to allow overlayfs to handle fscaps properly; all other
filesystems fall back to a generic implementation. The top-level vfs
xattr interfaces will now reject fscaps xattrs, though the lower-level
interfaces continue to accept them for reading and writing the raw xattr
data.

Based on previous feedback, new security hooks are added for fscaps
operations. These are really only needed for EVM, and the selinux and
smack implementations just perform the same operations that the
equivalent xattr hooks would have done. Note too that this has not yet
been updated based on the changes to make EVM into an LSM.

The remainder of the changes are preparatory work, addition of helpers
for converting between the xattr and kernel fscaps representation, and
various updates to use the kernel representation and new interfaces.

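As a rough illustration of the boundary conversion described above, the
following is a minimal userspace sketch and not the code from this
series. It parses a raw little-endian fscaps xattr into a struct whose
rootid has its own wrapper type, so a filesystem uid cannot be passed
where a mapped uid is expected without a compile error. The constants
mirror include/uapi/linux/capability.h; every sk_* name, the fixed-offset
"idmapping", and the choice to treat a v2 xattr as raw rootid 0 are
assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants mirroring include/uapi/linux/capability.h. */
#define SK_REV_MASK       0xFF000000u
#define SK_REV_2          0x02000000u
#define SK_REV_3          0x03000000u
#define SK_FLAG_EFFECTIVE 0x00000001u

/* Raw xattr layouts; every field is little-endian on disk. */
struct sk_raw_v2 {
	uint32_t magic_etc;
	struct { uint32_t permitted, inheritable; } data[2];
};
struct sk_raw_v3 {
	uint32_t magic_etc;
	struct { uint32_t permitted, inheritable; } data[2];
	uint32_t rootid;
};

/* The v3 layout must be the v2 layout plus a trailing rootid. */
static_assert(sizeof(struct sk_raw_v2) == 20, "v2 xattr is 20 bytes");
static_assert(sizeof(struct sk_raw_v3) == 24, "v3 xattr is 24 bytes");
static_assert(offsetof(struct sk_raw_v3, rootid) == sizeof(struct sk_raw_v2),
	      "rootid must directly follow the v2 fields");

/* Hypothetical wrapper types: distinct structs cannot be mixed up. */
typedef struct { uint32_t val; } sk_fsuid_t;  /* uid as stored on disk */
typedef struct { uint32_t val; } sk_vfsuid_t; /* uid after idmapping   */

/* Hypothetical internal representation with a typed rootid. */
struct sk_caps {
	uint32_t revision;
	int effective;
	uint64_t permitted;
	uint64_t inheritable;
	sk_vfsuid_t rootid;
};

/* Read one little-endian 32-bit value regardless of host endianness. */
static uint32_t sk_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Assumption: the idmapping is modelled as a fixed offset. */
static sk_vfsuid_t sk_map_uid(sk_fsuid_t uid, uint32_t offset)
{
	return (sk_vfsuid_t){ uid.val + offset };
}

/* Convert raw xattr bytes to the typed form; returns 0 or -1. */
int sk_caps_from_xattr(const unsigned char *buf, size_t len,
		       uint32_t idmap_offset, struct sk_caps *out)
{
	sk_fsuid_t raw_root = { 0 }; /* v2 carries no rootid */
	uint32_t magic, rev;

	if (len < sizeof(struct sk_raw_v2))
		return -1;
	magic = sk_le32(buf);
	rev = magic & SK_REV_MASK;
	if (rev == SK_REV_3 && len == sizeof(struct sk_raw_v3))
		raw_root.val = sk_le32(buf + offsetof(struct sk_raw_v3, rootid));
	else if (!(rev == SK_REV_2 && len == sizeof(struct sk_raw_v2)))
		return -1;

	out->revision = rev;
	out->effective = (magic & SK_FLAG_EFFECTIVE) != 0;
	/* data[0] holds the low 32 bits, data[1] the high 32 bits. */
	out->permitted = sk_le32(buf + 4) | ((uint64_t)sk_le32(buf + 12) << 32);
	out->inheritable = sk_le32(buf + 8) | ((uint64_t)sk_le32(buf + 16) << 32);
	out->rootid = sk_map_uid(raw_root, idmap_offset);
	return 0;
}
```

Once the conversion has happened at the boundary, callers of a function
like sk_caps_from_xattr() only ever see the sk_vfsuid_t, so the question
of which kind of uid a rootid holds is answered by its type rather than
by convention.
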
I have tested this series with xfstests, ltp, libcap2, and libcap-ng
with no regressions found.

To: Christian Brauner <brauner@xxxxxxxxxx>
To: Serge Hallyn <serge@xxxxxxxxxx>
To: Paul Moore <paul@xxxxxxxxxxxxxx>
To: Eric Paris <eparis@xxxxxxxxxx>
To: James Morris <jmorris@xxxxxxxxx>
To: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>
To: Miklos Szeredi <miklos@xxxxxxxxxx>
To: Amir Goldstein <amir73il@xxxxxxxxx>
Cc: <linux-kernel@xxxxxxxxxxxxxxx>
Cc: <linux-fsdevel@xxxxxxxxxxxxxxx>
Cc: <linux-security-module@xxxxxxxxxxxxxxx>
Cc: <audit@xxxxxxxxxxxxxxx>
Cc: <linux-unionfs@xxxxxxxxxxxxxxx>
Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@xxxxxxxxxx>
---
Changes in v2:
- Documented new inode operations in
  Documentation/filesystems/{vfs,locking}.rst.
- Changed types for sizes in function arguments and return values to
  size_t/ssize_t.
- Renamed flags arguments to setxattr_flags for clarity.
- Removed memory allocation when reading fscaps xattrs.
- Updated get_vfs_caps_from_disk() to use vfs_get_fscaps() and updated
  comments to explain how these functions are different.
- Updates/fixes to kernel-doc comments.
- Remove unnecessary type cast.
- Rename __vfs_{get,remove}_fscaps() to vfs_{get,remove}_fscaps_nosec().
- Add missing fsnotify_xattr() call in vfs_set_fscaps().
- Add fscaps security hooks along with appropriate handlers in selinux,
  smack, and evm.
- Remove remove_fscaps inode op in favor of passing NULL to set_fscaps.
- Added static asserts for compatibility of vfs_cap_data and
  vfs_ns_cap_data.
- ovl: remove unnecessary check around ovl_copy_up(), and add check
  before copying up fscaps for removal that the fscaps actually exist on
  the lower inode.
- ovl: install fscaps handlers for all inode types
- Add is_fscaps_xattr() helper and use it in place of open-coded strcmps
- Link to v1: https://lore.kernel.org/r/20231129-idmap-fscap-refactor-v1-0-da5a26058a5b@xxxxxxxxxx

---
Seth Forshee (DigitalOcean) (25):
      mnt_idmapping: split out core vfs[ug]id_t definitions into vfsid.h
      mnt_idmapping: include cred.h
      capability: add static asserts for compatibility of vfs_cap_data and vfs_ns_cap_data
      capability: rename cpu_vfs_cap_data to vfs_caps
      capability: use vfsuid_t for vfs_caps rootids
      capability: provide helpers for converting between xattrs and vfs_caps
      capability: provide a helper for converting vfs_caps to xattr for userspace
      xattr: add is_fscaps_xattr() helper
      commoncap: use is_fscaps_xattr()
      xattr: use is_fscaps_xattr()
      security: add hooks for set/get/remove of fscaps
      selinux: add hooks for fscaps operations
      smack: add hooks for fscaps operations
      evm: add support for fscaps security hooks
      security: call evm fscaps hooks from generic security hooks
      fs: add inode operations to get/set/remove fscaps
      fs: add vfs_get_fscaps()
      fs: add vfs_set_fscaps()
      fs: add vfs_remove_fscaps()
      ovl: add fscaps handlers
      ovl: use vfs_{get,set}_fscaps() for copy-up
      fs: use vfs interfaces for capabilities xattrs
      commoncap: remove cap_inode_getsecurity()
      commoncap: use vfs fscaps interfaces
      vfs: return -EOPNOTSUPP for fscaps from vfs_*xattr()

 Documentation/filesystems/locking.rst |   4 +
 Documentation/filesystems/vfs.rst     |  17 ++
 MAINTAINERS                           |   1 +
 fs/overlayfs/copy_up.c                |  72 ++---
 fs/overlayfs/dir.c                    |   2 +
 fs/overlayfs/inode.c                  |  72 +++++
 fs/overlayfs/overlayfs.h              |   5 +
 fs/xattr.c                            | 280 +++++++++++++++++-
 include/linux/capability.h            |  23 +-
 include/linux/evm.h                   |  39 +++
 include/linux/fs.h                    |  12 +
 include/linux/lsm_hook_defs.h         |   7 +
 include/linux/mnt_idmapping.h         |  67 +----
 include/linux/security.h              |  38 ++-
 include/linux/vfsid.h                 |  74 +++++
 include/linux/xattr.h                 |   5 +
 include/uapi/linux/capability.h       |  13 +
 kernel/auditsc.c                      |   9 +-
 security/commoncap.c                  | 529 ++++++++++++++++++----------------
 security/integrity/evm/evm_main.c     |  60 ++++
 security/security.c                   |  80 +++++
 security/selinux/hooks.c              |  26 ++
 security/smack/smack_lsm.c            |  71 +++++
 23 files changed, 1144 insertions(+), 362 deletions(-)
---
base-commit: 841c35169323cd833294798e58b9bf63fa4fa1de
change-id: 20230512-idmap-fscap-refactor-63b61fa0a36f

Best regards,
-- 
Seth Forshee (DigitalOcean) <sforshee@xxxxxxxxxx>