Re: [PATCH v1 0/9] fuse: basic support for idmapped mounts

Christian Brauner <brauner@xxxxxxxxxx> · Sun, 21 Jan 2024 18:50:57 +0100

On Mon, Jan 08, 2024 at 01:08:15PM +0100, Alexander Mikhalitsyn wrote:
> Dear friends,
> 
> This patch series aimed to provide support for idmapped mounts
> for fuse. We already have idmapped mounts support for almost all
> widely-used filesystems:
> * local (ext4, btrfs, xfs, fat, vfat, ntfs3, squashfs, f2fs, erofs, ZFS (out-of-tree))
> * network (ceph)
> 
> Git tree (based on https://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git/log/?h=for-next):
> v1: https://github.com/mihalicyn/linux/commits/fuse_idmapped_mounts.v1
> current: https://github.com/mihalicyn/linux/commits/fuse_idmapped_mounts

Great work!

> Things to discuss:
> - we enable idmapped mounts support only if "default_permissions" mode is enabled,
> because otherwise, we would need to deal with UID/GID mappings on the userspace side OR
> provide the userspace with idmapped req->in.h.uid/req->in.h.gid values which is not
> something that we probably want to do. Idmapped mounts philosophy is not about faking
> caller uid/gid.

Having VFS idmaps but then outsourcing permission checking to userspace
is conceptually strange so requiring default_permissions is the correct
thing to do. 

> - We have a small offlist discussion with Christian about adding fs_type->allow_idmap
> hook. Christian pointed out that it would be nice to have a superblock flag instead like
> SB_I_NOIDMAP and we can set this flag during mount time if we see that the filesystem does not
> support idmappings. But, unfortunately, I didn't succeed here because the kernel will
> know if the filesystem supports idmapping or not after FUSE_INIT request, but FUSE_INIT request
> is being sent at the end of the mounting process, so the mount and superblock will exist and
> visible by the userspace in that time. It seems like setting SB_I_NOIDMAP flag, in this
> case, is too late as a user may do the trick by creating an idmapped mount while it wasn't
> restricted by SB_I_NOIDMAP. Alternatively, we can introduce a "positive" version SB_I_ALLOWIDMAP

I see.

> and a "weak" version of FS_ALLOW_IDMAP like FS_MAY_ALLOW_IDMAP. So if FS_MAY_ALLOW_IDMAP is set,
> then SB_I_ALLOWIDMAP has to be set on the superblock to allow the creation of an idmapped mount.
> But that's a matter of our discussion.

I dislike making adding a struct super_block method. Because it means that we
call into the filesystem from generic mount code and specifically with the
namespace semaphore held. If there's ever any network filesystem that e.g.,
calls to a hung server it will lockup the whole system. So I'm opposed to
calling into the filesystem here at all. It's also ugly because this is really
a vfs level change. The only involvement should be whether the filesystem type
can actually support this ideally.

I think we should handle this within FUSE. So we allow the creation of idmapped
mounts just based on FS_ALLOW_IDMAP. And if the server doesn't support the
FUSE_OWNER_UID_GID_EXT then we simply refuse all creation requests originating
from an idmapped mount. Either we return EOPNOSUPP or we return EOVERFLOW to
indicate that we can't represent the owner correctly because the server is
missing the required extension.

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 3f37ba6a7a10..0726da21150a 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -606,8 +606,16 @@ static int get_create_ext(struct mnt_idmap *idmap,
                err = get_security_context(dentry, mode, &ext);
        if (!err && fc->create_supp_group)
                err = get_create_supp_group(dir, &ext);
-       if (!err && fc->owner_uid_gid_ext)
-               err = get_owner_uid_gid(idmap, fc, &ext);
+       if (!err) {
+               /*
+                * If the server doesn't support FUSE_OWNER_UID_GID_EXT and
+                * this is a creation request from an idmapped mount refuse it.
+                */
+               if (fc->owner_uid_gid_ext)
+                       err = get_owner_uid_gid(idmap, fc, &ext);
+               else if (idmap != &nop_mnt_idmap)
+                       err = -EOPNOTSUPP;
+       }

        if (!err && ext.size) {
                WARN_ON(args->in_numargs >= ARRAY_SIZE(args->in_args));