On Wed, Jul 12, 2023 at 11:56:35AM +0200, Christian Brauner wrote: > Changing the mode of symlinks is meaningless as the vfs doesn't take the > mode of a symlink into account during path lookup permission checking. > > However, the vfs doesn't block mode changes on symlinks. This however, > has lead to an untenable mess roughly classifiable into the following > two categories: > > (1) Filesystems that don't implement a i_op->setattr() for symlinks. > > Such filesystems may or may not know that without i_op->setattr() > defined, notify_change() falls back to simple_setattr() causing the > inode's mode in the inode cache to be changed. > > That's a generic issue as this will affect all non-size changing > inode attributes including ownership changes. > > Example: afs > > (2) Filesystems that fail with EOPNOTSUPP but change the mode of the > symlink nonetheless. > > Some filesystems will happily update the mode of a symlink but still > return EOPNOTSUPP. This is the biggest source of confusion for > userspace. > > The EOPNOTSUPP in this case comes from POSIX ACLs. Specifically it > comes from filesystems that call posix_acl_chmod(), e.g., btrfs via > > if (!err && attr->ia_valid & ATTR_MODE) > err = posix_acl_chmod(idmap, dentry, inode->i_mode); > > Filesystems including btrfs don't implement i_op->set_acl() so > posix_acl_chmod() will report EOPNOTSUPP. > > When posix_acl_chmod() is called, most filesystems will have > finished updating the inode. > > Perversely, this has the consequences that this behavior may depend > on two kconfig options and mount options: > > * CONFIG_POSIX_ACL={y,n} > * CONFIG_${FSTYPE}_POSIX_ACL={y,n} > * Opt_acl, Opt_noacl > > Example: btrfs, ext4, xfs > > The only way to change the mode on a symlink currently involves abusing > an O_PATH file descriptor in the following manner: > > fd = openat(-1, "/path/to/link", O_CLOEXEC | O_PATH | O_NOFOLLOW); > > char path[PATH_MAX]; > snprintf(path, sizeof(path), "/proc/self/fd/%d", fd); > chmod(path, 0000); > > But for most major filesystems with POSIX ACL support such as btrfs, > ext4, ceph, tmpfs, xfs and others this will fail with EOPNOTSUPP with > the mode still updated due to the aforementioned posix_acl_chmod() > nonsense. > > So, given that for all major filesystems this would fail with EOPNOTSUPP > and that both glibc (cf. [1]) and musl (cf. [2]) outright block mode > changes on symlinks we should just try and block mode changes on > symlinks directly in the vfs and have a clean break with this nonsense. > > If this causes any regressions, we do the next best thing and fix up all > filesystems that do return EOPNOTSUPP with the mode updated to not call > posix_acl_chmod() on symlinks. > > But as usual, let's try the clean cut solution first. It's a simple > patch that can be easily reverted. Not marking this for backport as I'll > do that manually if we're reasonably sure that this works and there are > no strong objections. > > We could block this in chmod_common() but it's more appropriate to do it > notify_change() as it will also mean that we catch filesystems that > change symlink permissions explicitly or accidently. > > Similar proposals were floated in the past as in [3] and [4] and again > recently in [5]. There's also a couple of bugs about this inconsistency > as in [6] and [7]. > > Link: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=99527a3727e44cb8661ee1f743068f108ec93979;hb=HEAD [1] > Link: https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c [2] > Link: https://lore.kernel.org/all/20200911065733.GA31579@xxxxxxxxxxxxx [3] > Link: https://sourceware.org/legacy-ml/libc-alpha/2020-02/msg00518.html [4] > Link: https://lore.kernel.org/lkml/87lefmbppo.fsf@xxxxxxxxxxxxxxxxxxxxxxxx [5] > Link: https://sourceware.org/legacy-ml/libc-alpha/2020-02/msg00467.html [6] > Link: https://sourceware.org/bugzilla/show_bug.cgi?id=14578#c17 [7] > Cc: stable@xxxxxxxxxxxxxxx # no backport before v6.6-rc2 is tagged How far back should this go? thanks, greg k-h