Re: [PATCH] xfs: do not clear S_ISUID|S_ISGID for idmapped mounts

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/22/22 13:24, Christian Brauner wrote:
On Tue, Feb 22, 2022 at 09:33:40AM +0100, Christoph Hellwig wrote:
On Mon, Feb 21, 2022 at 09:22:18PM +0300, Andrey Zhadchenko wrote:
xfs_fileattr_set() handles idmapped mounts correctly and do not drop this
bits.
Unfortunately chown syscall results in different callstask:
i_op->xfs_vn_setattr()->...->xfs_setattr_nonsize() which checks if process
has CAP_FSETID capable in init_user_ns rather than mntns userns.

Can you add an xfstests the exercises this path?

The fix itself looks good:

Reviewed-by: Christoph Hellwig <hch@xxxxxx>

So for anything other than directories the s{g,u}id bits are cleared on
every chown in notify_change() by the vfs; even for the root user (Also
documented on chown(2) manpage).

Only exception - chown preserves setgid bit set on a non-group-executable file (also documented there) but do not take root privileges into account at vfs level.


So the only scenario were this change would be relevant is for
directories afaict:

1. So ext4 has the behavior:

    ubuntu@f2-vm|~
    > mkdir suid.dir
    ubuntu@f2-vm|~
    > perms ./suid.dir
    drwxrwxr-x 775 (1000:1000) ./suid.dir
    ubuntu@f2-vm|~
    > chmod u+s ./suid.dir/
    ubuntu@f2-vm|~
    > perms ./suid.dir
    drwsrwxr-x 4775 (1000:1000) ./suid.dir
    ubuntu@f2-vm|~
    > chmod g+s ./suid.dir/
    ubuntu@f2-vm|~
    > perms ./suid.dir
    drwsrwsr-x 6775 (1000:1000) ./suid.dir
    ubuntu@f2-vm|~
    > chown 1000:1000 ./suid.dir/
    ubuntu@f2-vm|~
    > perms ./suid.dir/
    drwsrwsr-x 6775 (1000:1000) ./suid.dir/
meaning that both s{g,u}id bits are retained for directories. (Just to
    make this explicit: changing {g,u}id to the same {g,u}id still ends up
    calling into the filesystem.)

2. Whereas xfs currently has:

    brauner@wittgenstein|~
    > mkdir suid.dir
    brauner@wittgenstein|~
    > perms ./suid.dir
    drwxrwxr-x 775 ./suid.dir
    brauner@wittgenstein|~
    > chmod u+s ./suid.dir/
    brauner@wittgenstein|~
    > perms ./suid.dir
    drwsrwxr-x 4775 ./suid.dir
    brauner@wittgenstein|~
    > chmod g+s ./suid.dir/
    brauner@wittgenstein|~
    > perms ./suid.dir
    drwsrwsr-x 6775 ./suid.dir
    brauner@wittgenstein|~
    > chown 1000:1000 ./suid.dir/
    brauner@wittgenstein|~
    > perms ./suid.dir/
    drwxrwxr-x 775 ./suid.dir/
meaning that both s{g,u}id bits are cleared for directories.

Since the vfs will always ensure that s{g,u}id bits are stripped for
anything that isn't a directory in the vfs:
- ATTR_KILL_S{G,U}ID is raised in chown_common():

	if (!S_ISDIR(inode->i_mode))
		newattrs.ia_valid |=
			ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_KILL_PRIV;

- and then in notify_change() we'll get the bits stripped and ATTR_MODE
   raised:

	if (ia_valid & ATTR_KILL_SUID) {
		if (mode & S_ISUID) {
			ia_valid = attr->ia_valid |= ATTR_MODE;
			attr->ia_mode = (inode->i_mode & ~S_ISUID);
		}
	}
	if (ia_valid & ATTR_KILL_SGID) {
		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {

So SGID is not killed if there is no S_IXGRP (yet no capability check)

Actually I do not really understand why do kernel expects filesystems to further apply restrictions with CAP_FSETID. Why not kill it here since we have all info?

			if (!(ia_valid & ATTR_MODE)) {
				ia_valid = attr->ia_valid |= ATTR_MODE;
				attr->ia_mode = inode->i_mode;
			}
			attr->ia_mode &= ~S_ISGID;
		}
	}

we can change this codepath to just mirror setattr_copy() or switch
fully to setattr_copy() (if feasible).

Because as of right now the code seems to imply that the xfs code itself
is responsible for stripping s{g,u}id bits for all files whereas it is
the vfs that does it for any non-directory. So I'd propose to either try
and switch that code to setattr_copy() or to do open-code the
setattr_copy() check:

diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index b79b3846e71b..ff55b31521a2 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -748,9 +748,13 @@ xfs_setattr_nonsize(
                  * The set-user-ID and set-group-ID bits of a file will be
                  * cleared upon successful return from chown()
                  */
-               if ((inode->i_mode & (S_ISUID|S_ISGID)) &&
-                   !capable(CAP_FSETID))
-                       inode->i_mode &= ~(S_ISUID|S_ISGID);
+               if (iattr->ia_valid & ATTR_MODE) {
+                       umode_t mode = iattr->ia_mode;
+                       if (!in_group_p(i_gid_into_mnt(mnt_userns, inode)) &&
+                           !capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID))
+                               mode &= ~S_ISGID;
+                       inode->i_mode = mode;
+               }

                 /*
                  * Change the ownerships and register quota modifications

which aligns xfs with ext4 and any other filesystem. Any thoughts on
this?

For @Andrey specifically: the tests these should go into:

https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/tree/src/idmapped-mounts/idmapped-mounts.c

Note that there are already setgid inheritance tests and set*id
execution tests in there.
You should be able to copy a lot of code from them. Could you please add
the test I sketched above and ideally also a test that the set{g,u}id
bits are stripped during chown for regular files?
Thanks for the link. To clarify what tests and result you expect:
- for directory chown we expect to preserve s{g,u}id
- for regfile chown we expect to preserve S_ISGID only if S_IXGRP is absent and CAP_FSETID is present

JFYI: I found out this problem while running LTP (specifically syscalls/chown02 test) on idmapped XFS. Maybe I will be able to find more, who knows.



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux