On Thu, Mar 10, 2022 at 09:34:50AM -0800, Darrick J. Wong wrote: > On Thu, Mar 10, 2022 at 11:11:58AM +0100, Christian Brauner wrote: > > On Wed, Mar 09, 2022 at 11:22:11AM -0800, Darrick J. Wong wrote: > > > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > > > Filipe Manana pointed out that XFS' behavior w.r.t. setuid/setgid > > > revocation isn't consistent with btrfs[1] or ext4. Those two > > > filesystems use the VFS function setattr_copy to convey certain > > > attributes from struct iattr into the VFS inode structure. > > > > > > Andrey Zhadchenko reported[2] that XFS uses the wrong user namespace to > > > decide if it should clear setgid and setuid on a file attribute update. > > > This is a second symptom of the problem that Filipe noticed. > > > > > > XFS, on the other hand, open-codes setattr_copy in xfs_setattr_mode, > > > xfs_setattr_nonsize, and xfs_setattr_time. Regrettably, setattr_copy is > > > /not/ a simple copy function; it contains additional logic to clear the > > > setgid bit when setting the mode, and XFS' version no longer matches. > > > > > > The VFS implements its own setuid/setgid stripping logic, which > > > establishes consistent behavior. It's a tad unfortunate that it's > > > scattered across notify_change, should_remove_suid, and setattr_copy but > > > XFS should really follow the Linux VFS. Adapt XFS to use the VFS > > > functions and get rid of the old functions. > > > > > > [1] https://lore.kernel.org/fstests/CAL3q7H47iNQ=Wmk83WcGB-KBJVOEtR9+qGczzCeXJ9Y2KCV25Q@xxxxxxxxxxxxxx/ > > > [2] https://lore.kernel.org/linux-xfs/20220221182218.748084-1-andrey.zhadchenko@xxxxxxxxxxxxx/ > > > > > > Fixes: f736d93d76d3 ("xfs: support idmapped mounts") > > > > Fwiw, as I've pointed out in > > https://lore.kernel.org/linux-xfs/20220222122331.ijeapomur76h7xf6@wittgenstein/ > > the original analysis that this commit message links to in [2] is not > > correct. But the thread clarifies it so I think it's fine. > > > > But I think the fixes tag is wrong here afaict. As I've pointed out in > > https://lore.kernel.org/linux-xfs/20220222123656.433l67bxhv3s2vbo@wittgenstein/ > > the faulty behavior should predate idmapped mounts by a lot. The bug is > > with capable(CAP_FSETID). A simple reproducer that should work on any > > pre 5.12 kernel is: > > > > brauner@wittgenstein|~/src/git/linux/ltp/testcases/kernel/syscalls/chown|master %= > > > unshare -U --map-root > > root@wittgenstein|~/src/git/linux/ltp/testcases/kernel/syscalls/chown|master %= > > > PATH=$PATH:$PWD ./chown02 > > tst_memutils.c:157: TWARN: Can't adjust score, even with capabilities!? > > tst_test.c:1455: TINFO: Timeout per run is 0h 05m 00s > > chown02.c:45: TPASS: chown(testfile1, 0, 0) passed > > chown02.c:45: TPASS: chown(testfile2, 0, 0) passed > > chown02.c:57: TFAIL: testfile2: wrong mode permissions 0100700, expected 0102700 > > > > Summary: > > passed 2 > > failed 1 > > broken 0 > > skipped 0 > > warnings 1 > > > > There's no idmapped mounts here in play. The caller simply has been > > placed in a new user namespace and thus they fail the current > > capable(CAP_FSETID) check which will cause xfs to strip the sgid bit. > > > > Now trying the same with ext4: > > > > ubuntu@f2-vm:~/src/git/linux/ltp/testcases/kernel/syscalls/chown$ unshare -U --map-root > > root@f2-vm:~/src/git/linux/ltp/testcases/kernel/syscalls/chown# PATH=$PATH:$PWD ./chown02 > > tst_memutils.c:157: TWARN: Can't adjust score, even with capabilities!? > > tst_test.c:1455: TINFO: Timeout per run is 0h 05m 00s > > chown02.c:45: TPASS: chown(testfile1, 0, 0) passed > > chown02.c:45: TPASS: chown(testfile2, 0, 0) passed > > > > Summary: > > passed 2 > > failed 0 > > broken 0 > > skipped 0 > > warnings 1 > > > > it passes since ext4 uses setattr_copy() and thus the capability is > > checked for in the caller's user namespace. > > Hmm. The last person to touch the ATTR_MODE part of setattr_copy was > you, back in January 2021: > > 2f221d6f7b88 ("attr: handle idmapped mounts") > > though I think that was merely switching the user_ns parameter to > inode_capable_wrt_uidgid. The previous major change was made by Andy > Lutomirski in 2014: > > 23adbe12ef7d ("fs,userns: Change inode_capable to capable_wrt_inode_uidgid") > > This seems to be the start(?) of the "_wrt_inode_uidgid" variants, > though I think the only real change in behavior was checking that the > inode's gid is mapped in the current userns? Going back even further, > it looks like Eric Biederman started thsi whole process in 2012 with: > > 7fa294c8991c ("userns: Allow chown and setgid preservation") > > This patch switched the VFS from purely checking process capabilities to > checking the privileges of the userns, I think. From a purely code > inspection perspective, this is where the behavior divergence between > XFS and VFS began. > > I'm ok with switching the fixes tag to 7fa294, though I won't be shocked > if this patch doesn't apply cleanly to fs/xfs/ from that era. It very likely won't but from what I understand it would hopefully only need to be backported to 5.<something>. But even then there's a good chance it won't apply. > > Thanks for the review! > > <rant> > That said, I now have this bad habit of picking more recent commits for > Fixes because I get so much blowback from the stable maintainers it's > not worth any of the frustration. > > I *do* still agree with the principle that a fixpatch should reference > the exact moment things went wrong, even if some autobackport bot can't > trivially apply the patch! > > I fully expect to get complaints from the LTS maintainers like I always > do when I attach a Fixes tag without throughly compile-testing every LTS > branch between the tag and HEAD. They clearly haven't taken any of my Oh, I didn't even know that that was expected. That's a rather time intensive ask. I understand why that is frustrating! :( > hints, so I'll just say it: I DON'T HAVE TIME TO QA ANYTHING OTHER THAN > UPSTREAM! Six LTS kernels (plus 3 distro kernels) will eat a week and a > half of time on the testing cloud, **per fix**. > > Hell, you all have probably noticed: I haven't had sufficient mental > spoons to do much of anything with upstream, so there is no 5.18 merge > branch and reviews are getting sloppier. > </rant>