On Wed, Mar 09, 2022 at 11:22:11AM -0800, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > Filipe Manana pointed out that XFS' behavior w.r.t. setuid/setgid > revocation isn't consistent with btrfs[1] or ext4. Those two > filesystems use the VFS function setattr_copy to convey certain > attributes from struct iattr into the VFS inode structure. > > Andrey Zhadchenko reported[2] that XFS uses the wrong user namespace to > decide if it should clear setgid and setuid on a file attribute update. > This is a second symptom of the problem that Filipe noticed. > > XFS, on the other hand, open-codes setattr_copy in xfs_setattr_mode, > xfs_setattr_nonsize, and xfs_setattr_time. Regrettably, setattr_copy is > /not/ a simple copy function; it contains additional logic to clear the > setgid bit when setting the mode, and XFS' version no longer matches. > > The VFS implements its own setuid/setgid stripping logic, which > establishes consistent behavior. It's a tad unfortunate that it's > scattered across notify_change, should_remove_suid, and setattr_copy but > XFS should really follow the Linux VFS. Adapt XFS to use the VFS > functions and get rid of the old functions. > > [1] https://lore.kernel.org/fstests/CAL3q7H47iNQ=Wmk83WcGB-KBJVOEtR9+qGczzCeXJ9Y2KCV25Q@xxxxxxxxxxxxxx/ > [2] https://lore.kernel.org/linux-xfs/20220221182218.748084-1-andrey.zhadchenko@xxxxxxxxxxxxx/ > > Fixes: f736d93d76d3 ("xfs: support idmapped mounts") Fwiw, as I've pointed out in https://lore.kernel.org/linux-xfs/20220222122331.ijeapomur76h7xf6@wittgenstein/ the original analysis that this commit message links to in [2] is not correct. But the thread clarifies it so I think it's fine. But I think the fixes tag is wrong here afaict. As I've pointed out in https://lore.kernel.org/linux-xfs/20220222123656.433l67bxhv3s2vbo@wittgenstein/ the faulty behavior should predate idmapped mounts by a lot. The bug is with capable(CAP_FSETID). A simple reproducer that should work on any pre 5.12 kernel is: brauner@wittgenstein|~/src/git/linux/ltp/testcases/kernel/syscalls/chown|master %= > unshare -U --map-root root@wittgenstein|~/src/git/linux/ltp/testcases/kernel/syscalls/chown|master %= > PATH=$PATH:$PWD ./chown02 tst_memutils.c:157: TWARN: Can't adjust score, even with capabilities!? tst_test.c:1455: TINFO: Timeout per run is 0h 05m 00s chown02.c:45: TPASS: chown(testfile1, 0, 0) passed chown02.c:45: TPASS: chown(testfile2, 0, 0) passed chown02.c:57: TFAIL: testfile2: wrong mode permissions 0100700, expected 0102700 Summary: passed 2 failed 1 broken 0 skipped 0 warnings 1 There's no idmapped mounts here in play. The caller simply has been placed in a new user namespace and thus they fail the current capable(CAP_FSETID) check which will cause xfs to strip the sgid bit. Now trying the same with ext4: ubuntu@f2-vm:~/src/git/linux/ltp/testcases/kernel/syscalls/chown$ unshare -U --map-root root@f2-vm:~/src/git/linux/ltp/testcases/kernel/syscalls/chown# PATH=$PATH:$PWD ./chown02 tst_memutils.c:157: TWARN: Can't adjust score, even with capabilities!? tst_test.c:1455: TINFO: Timeout per run is 0h 05m 00s chown02.c:45: TPASS: chown(testfile1, 0, 0) passed chown02.c:45: TPASS: chown(testfile2, 0, 0) passed Summary: passed 2 failed 0 broken 0 skipped 0 warnings 1 it passes since ext4 uses setattr_copy() and thus the capability is checked for in the caller's user namespace. > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > --- Other than the wrong Fixes:, Reviewed-by: Christian Brauner <brauner@xxxxxxxxxx>