Re: [RFC PATCH v2] hostfs: handle idmapped mounts

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 3/4/23 12:01, Christian Brauner wrote:
On Sat, Mar 04, 2023 at 12:28:46AM -0600, Glenn Washburn wrote:
On Thu, 2 Mar 2023 09:39:28 +0100
Christian Brauner <brauner@xxxxxxxxxx> wrote:

On Tue, Feb 28, 2023 at 07:50:02PM -0600, Glenn Washburn wrote:
Let hostfs handle idmapped mounts. This allows to have the same
hostfs mount appear in multiple locations with different id
mappings.

root@(none):/media# id
uid=0(root) gid=0(root) groups=0(root)
root@(none):/media# mkdir mnt idmapped
root@(none):/media# mount -thostfs -o/home/user hostfs mnt

root@(none):/media# touch mnt/aaa
root@(none):/media# mount-idmapped --map-mount u:`id -u user`:0:1
--map-mount g:`id -g user`:0:1 /media/mnt /media/idmapped
root@(none):/media# ls -l mnt/aaa idmapped/aaa -rw-r--r-- 1 root
root 0 Jan 28 01:23 idmapped/aaa -rw-r--r-- 1 user user 0 Jan 28
01:23 mnt/aaa

root@(none):/media# touch idmapped/bbb
root@(none):/media# ls -l mnt/bbb idmapped/bbb
-rw-r--r-- 1 root root 0 Jan 28 01:26 idmapped/bbb
-rw-r--r-- 1 user user 0 Jan 28 01:26 mnt/bbb

Signed-off-by: Glenn Washburn <development@xxxxxxxxxxxxxxx>
---
Changes from v1:
  * Rebase on to tip. The above commands work and have the results
expected. The __vfsuid_val(make_vfsuid(...)) seems ugly to get the
uid_t, but it seemed like the best one I've come across. Is there a
better way?

Sure, I can help you with that. ;)

Thank you!


Glenn
---
  fs/hostfs/hostfs_kern.c | 13 +++++++------
  1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
index c18bb50c31b6..9459da99a0db 100644
--- a/fs/hostfs/hostfs_kern.c
+++ b/fs/hostfs/hostfs_kern.c
@@ -786,7 +786,7 @@ static int hostfs_permission(struct mnt_idmap
*idmap, err = access_file(name, r, w, x);
  	__putname(name);
  	if (!err)
-		err = generic_permission(&nop_mnt_idmap, ino,
desired);
+		err = generic_permission(idmap, ino, desired);
  	return err;
  }
@@ -794,13 +794,14 @@ static int hostfs_setattr(struct mnt_idmap
*idmap, struct dentry *dentry, struct iattr *attr)
  {
  	struct inode *inode = d_inode(dentry);
+	struct user_namespace *fs_userns = i_user_ns(inode);

Fyi, since hostfs can't be mounted in a user namespace
fs_userns == &init_user_ns
so it doesn't really matter what you use.

What would you suggest as preferable?

I would leave init_user_ns hardcoded as it clearly indicates that hostfs
can only be mounted in the initial user namespace. Plus, the patch is
smaller.


  	struct hostfs_iattr attrs;
  	char *name;
  	int err;
int fd = HOSTFS_I(inode)->fd; - err = setattr_prepare(&nop_mnt_idmap, dentry, attr);
+	err = setattr_prepare(idmap, dentry, attr);
  	if (err)
  		return err;
@@ -814,11 +815,11 @@ static int hostfs_setattr(struct mnt_idmap
*idmap, }
  	if (attr->ia_valid & ATTR_UID) {
  		attrs.ia_valid |= HOSTFS_ATTR_UID;
-		attrs.ia_uid = from_kuid(&init_user_ns,
attr->ia_uid);
+		attrs.ia_uid = __vfsuid_val(make_vfsuid(idmap,
fs_userns, attr->ia_uid)); }
  	if (attr->ia_valid & ATTR_GID) {
  		attrs.ia_valid |= HOSTFS_ATTR_GID;
-		attrs.ia_gid = from_kgid(&init_user_ns,
attr->ia_gid);
+		attrs.ia_gid = __vfsgid_val(make_vfsgid(idmap,
fs_userns, attr->ia_gid));

Heh, if you look include/linux/fs.h:

         /*
          * The two anonymous unions wrap structures with the same
member. *
          * Filesystems raising FS_ALLOW_IDMAP need to use
ia_vfs{g,u}id which
          * are a dedicated type requiring the filesystem to use the
dedicated
          * helpers. Other filesystem can continue to use ia_{g,u}id
until they
          * have been ported.
          *
          * They always contain the same value. In other words
FS_ALLOW_IDMAP
          * pass down the same value on idmapped mounts as they would
on regular
          * mounts.
          */
         union {
                 kuid_t          ia_uid;
                 vfsuid_t        ia_vfsuid;
         };
         union {
                 kgid_t          ia_gid;
                 vfsgid_t        ia_vfsgid;
         };

this just is:

attrs.ia_uid = from_vfsuid(idmap, fs_userns, attr->ia_vfsuid));
attrs.ia_gid = from_vfsgid(idmap, fs_userns, attr->ia_vfsgid));

Its easy to miss from this patch because of lack of context, but attrs
is a struct hostfs_iattr, not struct iattr. And attrs.ia_uid is of type
uid_t, not kuid_t. So the above fails to compile. This is why I needed

Oh, I see. And then that raw value is used by calling
fchmod()/chmod()/chown()/fchown() and so on. That's rather special.
Ok, then I know what to do.

to wrap make_vfsuid() in __vfsuid_val() (to get the uid_t).

Right. My point had been - independent of the struct hostfs_iattr issue
you thankfully pointed out - that make_vfsuid() is wrong here.

make_vfsuid() is used to map a filesystem wide k{g,u}id_t according to
the mount's idmapping that operation originated from. But that's done
by the vfs way before we're calling into the filesystem. For example,
it's done in chown_common().

So the value placed in struct iattr (the VFS struct) is already a
vfs{g,u}id stored in iattr->ia_vfs{g,u}id. So you need to use
from_vfs{g,u}id() here.


I had decided against using from_vfsuid() because then I thought I'd
need to use from_kuid() to get the uid_t. And from_kuid() takes the
namespace (again), which seemed uglier.

Based on this, what do you suggest?

Ok, so just some details on the background before I paste what I think
we should do.
As soon as you support idmapped mounts you at least technically are
Thanks for the detailed explanation. I apologize for not getting back to this sooner.

always dealing with two mappings:

(1) First, there's the filesystem wide idmapping which is taken from the
     namespace the filessytem was mounted in. This idmapping is applied
     when you read the raw uid/gid value from disk and turn into a kuid_t
     type. That value is persistent and stored in inode->i_{g,u}id. All
     things that are cached and that can be accessed from multiple mounts
     concurrently can only ever cache k{g,u}id_t aka filesystem values.
(2) Whenever we're dealing with an operation that's coming from an
     idmapped mount we need to take the idmapping of the mount into
     account. That idmapping is completely separate type struct
     mnt_idmap that's opaque for filesystems and most of the vfs.

     That idmapping is used to generate the vfs{g,u}id_t. IOW, translates
     from the filesystem representation to a mount/vfs representation.

So, in order to store the correct value on disk we need to invert those
two idmappings to arrive at the raw value that we want to store:
(U1) from_vfsuid() // map to the filesystem wide value aka something
      that we can store in inode->i_{g,u}id and that's cacheable. This is
      done in setattr_copy().
(U2) from_kuid() // map the filesystem wide value to the raw value we
      want to store on disk

It seems to me that there are actually 3 mappings, with the third being (U2) above (ie vfsuid_t -> kuid_t). And that from_vfsuid() does mappings (1) and (2) above. Is this incorrect?

Whats confusing to me is that from_vfsuid() takes both an idmap and a user namespace, so presumably it will handle both mapping types (1) and (2). And then there's from_kuid() which takes an idmap, so I thought it might also do a type (2) mapping. But looking at the code it doesn't seem to ever use its idmap parameter. Can you explain the rational behind having from_kuid() take an idmap? Is it legacy that will be cleaned up as this code settles down / stabilizes? Or perhaps its


For nearly all filesystems these steps almost never need to be performed
explicitly. Instead, dedicated vfs helpers will do this:

(U1) i_{g,u}id_update() // map to filesystem wide value
(U2) i_{g,u}id_read() // map to raw on-disk value

For filesystems that don't support being mounted in namespaces the (U2)
step is always a nop. So technically there's no difference between:

(U2) from_kuid() and __kuid_val(kuid)

but it's cleaner to use the helpers even in that case.

So given how hostfs works these two steps need to be performed
explicitly. So I suggest (untested):

diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
index c18bb50c31b6..72b7e1bcc32e 100644
--- a/fs/hostfs/hostfs_kern.c
+++ b/fs/hostfs/hostfs_kern.c
@@ -813,12 +813,22 @@ static int hostfs_setattr(struct mnt_idmap *idmap,
                 attrs.ia_mode = attr->ia_mode;
         }
         if (attr->ia_valid & ATTR_UID) {
+               kuid_t kuid;
+
                 attrs.ia_valid |= HOSTFS_ATTR_UID;
-               attrs.ia_uid = from_kuid(&init_user_ns, attr->ia_uid);
+               /* Map the vfs id into the filesystem. */
+               kuid = from_vfsuid(idmap, &init_user_ns, attr->ia_vfsuid);
+               /* Map the filesystem id to its raw on disk value. */
+               attrs.ia_uid = from_kuid(&init_user_ns, kuid);

Its interesting that this is what I originally discarded, as an unfamiliar reader, it looks like you're doing two namespace mappings. But that's not happening because from_kuid() disregards its namespace parameter.

I've tested this and it does seems to work. Thanks!

Glenn

         }
         if (attr->ia_valid & ATTR_GID) {
+               kgid_t kgid;
+
                 attrs.ia_valid |= HOSTFS_ATTR_GID;
-               attrs.ia_gid = from_kgid(&init_user_ns, attr->ia_gid);
+               /* Map the vfs id into the filesystem. */
+               kgid = from_vfsgid(idmap, &init_user_ns, attr->ia_vfsgid);
+               /* Map the filesystem id to its raw on disk value. */
+               attrs.ia_gid = from_kgid(&init_user_ns, kgid);
         }
         if (attr->ia_valid & ATTR_SIZE) {
                 attrs.ia_valid |= HOSTFS_ATTR_SIZE;




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux