On Tue, 2022-04-26 at 19:11 +0800, Yang Xu wrote:
> Add a dedicated helper to handle the setgid bit when creating a new file
> in a setgid directory. This is a preparatory patch for moving setgid
> stripping into the vfs. The patch contains no functional changes.
>
> Currently the setgid stripping logic is open-coded directly in
> inode_init_owner() and the individual filesystems are responsible for
> handling setgid inheritance. Since this has proven to be brittle as
> evidenced by old issues we uncovered over the last months (see [1] to
> [3] below) we will try to move this logic into the vfs.
>
> Link: e014f37db1a2 ("xfs: use setattr_copy to set vfs inode attributes") [1]
> Link: 01ea173e103e ("xfs: fix up non-directory creation in SGID directories") [2]
> Link: fd84bfdddd16 ("ceph: fix up non-directory creation in SGID directories") [3]
> Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> Reviewed-by: Christian Brauner (Microsoft) <brauner@xxxxxxxxxx>
> Signed-off-by: Yang Xu <xuyang2018.jy@xxxxxxxxxxx>
> ---
>  fs/inode.c         | 37 +++++++++++++++++++++++++++++++++----
>  include/linux/fs.h |  2 ++
>  2 files changed, 35 insertions(+), 4 deletions(-)
>
> diff --git a/fs/inode.c b/fs/inode.c
> index 9d9b422504d1..e9a5f2ec2f89 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -2246,10 +2246,8 @@ void inode_init_owner(struct user_namespace *mnt_userns, struct inode *inode,
>  		/* Directories are special, and always inherit S_ISGID */
>  		if (S_ISDIR(mode))
>  			mode |= S_ISGID;
> -		else if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP) &&
> -			 !in_group_p(i_gid_into_mnt(mnt_userns, dir)) &&
> -			 !capable_wrt_inode_uidgid(mnt_userns, dir, CAP_FSETID))
> -			mode &= ~S_ISGID;
> +		else
> +			mode = mode_strip_sgid(mnt_userns, dir, mode);
>  	} else
>  		inode_fsgid_set(inode, mnt_userns);
>  	inode->i_mode = mode;
> @@ -2405,3 +2403,34 @@ struct timespec64 current_time(struct inode *inode)
>  	return timestamp_truncate(now, inode);
>  }
>  EXPORT_SYMBOL(current_time);
> +
> +/**
> + * mode_strip_sgid - handle the sgid bit for non-directories
> + * @mnt_userns: User namespace of the mount the inode was created from
> + * @dir: parent directory inode
> + * @mode: mode of the file to be created in @dir
> + *
> + * If the @mode of the new file has both the S_ISGID and S_IXGRP bit
> + * raised and @dir has the S_ISGID bit raised ensure that the caller is
> + * either in the group of the parent directory or they have CAP_FSETID
> + * in their user namespace and are privileged over the parent directory.
> + * In all other cases, strip the S_ISGID bit from @mode.
> + *
> + * Return: the new mode to use for the file
> + */
> +umode_t mode_strip_sgid(struct user_namespace *mnt_userns,
> +			const struct inode *dir, umode_t mode)
> +{
> +	if (S_ISDIR(mode) || !dir || !(dir->i_mode & S_ISGID))
> +		return mode;
> +	if ((mode & (S_ISGID | S_IXGRP)) != (S_ISGID | S_IXGRP))
> +		return mode;
> +	if (in_group_p(i_gid_into_mnt(mnt_userns, dir)))
> +		return mode;
> +	if (capable_wrt_inode_uidgid(mnt_userns, dir, CAP_FSETID))
> +		return mode;
> +
> +	mode &= ~S_ISGID;
> +	return mode;
> +}
> +EXPORT_SYMBOL(mode_strip_sgid);
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index bbde95387a23..98b44a2732f5 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1897,6 +1897,8 @@ extern long compat_ptr_ioctl(struct file *file, unsigned int cmd,
>  void inode_init_owner(struct user_namespace *mnt_userns, struct inode *inode,
>  		      const struct inode *dir, umode_t mode);
>  extern bool may_open_dev(const struct path *path);
> +umode_t mode_strip_sgid(struct user_namespace *mnt_userns,
> +			const struct inode *dir, umode_t mode);
>
>  /*
>   * This is the "filldir" function type, used by readdir() to let

This series looks like a nice cleanup. I went ahead and added this pile
to another kernel I was testing with xfstests and it seemed to do fine.

You can add this (or some variant of it) to all 4 patches.

Reviewed-and-Tested-by: Jeff Layton <jlayton@xxxxxxxxxx>