On Wed, Jul 12, 2023 at 06:21:51PM +0200, Greg KH wrote: > On Wed, Jul 12, 2023 at 11:56:35AM +0200, Christian Brauner wrote: > > Changing the mode of symlinks is meaningless as the vfs doesn't take the > > mode of a symlink into account during path lookup permission checking. > > > > However, the vfs doesn't block mode changes on symlinks. This however, > > has lead to an untenable mess roughly classifiable into the following > > two categories: > > > > (1) Filesystems that don't implement a i_op->setattr() for symlinks. > > > > Such filesystems may or may not know that without i_op->setattr() > > defined, notify_change() falls back to simple_setattr() causing the > > inode's mode in the inode cache to be changed. > > > > That's a generic issue as this will affect all non-size changing > > inode attributes including ownership changes. > > > > Example: afs > > > > (2) Filesystems that fail with EOPNOTSUPP but change the mode of the > > symlink nonetheless. > > > > Some filesystems will happily update the mode of a symlink but still > > return EOPNOTSUPP. This is the biggest source of confusion for > > userspace. > > > > The EOPNOTSUPP in this case comes from POSIX ACLs. Specifically it > > comes from filesystems that call posix_acl_chmod(), e.g., btrfs via > > > > if (!err && attr->ia_valid & ATTR_MODE) > > err = posix_acl_chmod(idmap, dentry, inode->i_mode); > > > > Filesystems including btrfs don't implement i_op->set_acl() so > > posix_acl_chmod() will report EOPNOTSUPP. > > > > When posix_acl_chmod() is called, most filesystems will have > > finished updating the inode. > > > > Perversely, this has the consequences that this behavior may depend > > on two kconfig options and mount options: > > > > * CONFIG_POSIX_ACL={y,n} > > * CONFIG_${FSTYPE}_POSIX_ACL={y,n} > > * Opt_acl, Opt_noacl > > > > Example: btrfs, ext4, xfs > > > > The only way to change the mode on a symlink currently involves abusing > > an O_PATH file descriptor in the following manner: > > > > fd = openat(-1, "/path/to/link", O_CLOEXEC | O_PATH | O_NOFOLLOW); > > > > char path[PATH_MAX]; > > snprintf(path, sizeof(path), "/proc/self/fd/%d", fd); > > chmod(path, 0000); > > > > But for most major filesystems with POSIX ACL support such as btrfs, > > ext4, ceph, tmpfs, xfs and others this will fail with EOPNOTSUPP with > > the mode still updated due to the aforementioned posix_acl_chmod() > > nonsense. > > > > So, given that for all major filesystems this would fail with EOPNOTSUPP > > and that both glibc (cf. [1]) and musl (cf. [2]) outright block mode > > changes on symlinks we should just try and block mode changes on > > symlinks directly in the vfs and have a clean break with this nonsense. > > > > If this causes any regressions, we do the next best thing and fix up all > > filesystems that do return EOPNOTSUPP with the mode updated to not call > > posix_acl_chmod() on symlinks. > > > > But as usual, let's try the clean cut solution first. It's a simple > > patch that can be easily reverted. Not marking this for backport as I'll > > do that manually if we're reasonably sure that this works and there are > > no strong objections. > > > > We could block this in chmod_common() but it's more appropriate to do it > > notify_change() as it will also mean that we catch filesystems that > > change symlink permissions explicitly or accidently. > > > > Similar proposals were floated in the past as in [3] and [4] and again > > recently in [5]. There's also a couple of bugs about this inconsistency > > as in [6] and [7]. > > > > Link: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=99527a3727e44cb8661ee1f743068f108ec93979;hb=HEAD [1] > > Link: https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c [2] > > Link: https://lore.kernel.org/all/20200911065733.GA31579@xxxxxxxxxxxxx [3] > > Link: https://sourceware.org/legacy-ml/libc-alpha/2020-02/msg00518.html [4] > > Link: https://lore.kernel.org/lkml/87lefmbppo.fsf@xxxxxxxxxxxxxxxxxxxxxxxx [5] > > Link: https://sourceware.org/legacy-ml/libc-alpha/2020-02/msg00467.html [6] > > Link: https://sourceware.org/bugzilla/show_bug.cgi?id=14578#c17 [7] > > Cc: stable@xxxxxxxxxxxxxxx # no backport before v6.6-rc2 is tagged > > How far back should this go? If this holds up without regressions than all LTSes. That's what Amir and Leah did for some other work. I can add that to the comment for clarity.