On Thu, Apr 28, 2022 at 10:45:01AM +0200, Christian Brauner wrote: > On Thu, Apr 28, 2022 at 02:49:11AM +0000, Al Viro wrote: > > On Thu, Apr 28, 2022 at 02:23:25AM +0000, xuyang2018.jy@xxxxxxxxxxx wrote: > > > > > > BTW, xfs has grpid option as well: > > > > if (dir&& !(dir->i_mode& S_ISGID)&& xfs_has_grpid(mp)) { > > > > inode_fsuid_set(inode, mnt_userns); > > > > inode->i_gid = dir->i_gid; > > > > inode->i_mode = mode; > > > > } else { > > > > inode_init_owner(mnt_userns, inode, dir, mode); > > > > } > > > > > > > > We could lift that stuff into VFS, but it would require lifting that flag > > > > (BSD vs. SysV behaviour wrt GID - BSD *always* inherits GID from parent > > > > and ignores SGID on directories) into generic superblock. Otherwise we'd > > > > be breaking existing behaviour for ext* and xfs... > > > > > > I also mentioned it in my previous version(in the 3/4 patch) > > > "This patch also changed grpid behaviour for ext4/xfs because the mode > > > passed to them may been changed by vfs_prepare_mode. > > > " > > > > > > I guess we can add a grpid option check in vfs_prepare_mode or in > > > mode_strip_sgid, then it should not break the existing behaviour for > > > ext* and xfs. > > > > I don't like it, TBH. That way we have > > 1) caller mangles mode (after having looked at grpid, etc.) > > 2) filesystem checks grpid and either calls inode_init_owner(), > > which might (or might not) modify the gid to be used or skips the > > call and assigns gid directly. > > > > It's asking for trouble. We have two places where the predicate is > > checked; the first mangles mode (and I'm still not convinced that we > > need to bother with that at all), the second affects gid (and for > > mkdir in sgid directory on non-grpid ones inherits sgid). > > > > That kind of structure is asking for trouble. *IF* we make inode_init_owner() > > aware of grpid (and make that predicate usable from generic helper), we might > > as well just make inode_init_owner() mandatory (for the first time ever) and > > get rid of grpid checks in filesystems themselves. But then there's no point > > doing it in method callers. > > This has ordering issues. In the vfs the umask is stripped and then we > call into the filesystem and inode_init_owner() is called. So if > S_IXGRP is removed by the umask then we inherit the S_ISGID bit. > > But if the filesystem uses POSIX ACLs then the umask isn't stripped in > the vfs and instead may be done (I say "may" because it depends on > whether or not applicable POSIX ACLs are found) in posix_acl_create(). > > But in order to call posix_acl_create() the filesystem will have often > ran through inode_init_owner() first. For example, ext4 does > inode_init_owner() and afterwards calls ext4_init_acl() which in turn > ends up calling posix_acl_create() which _may_ strip the umask. > > Iow, you end up with two possible setgid removal paths: > * strip setgid first, apply umask (no POSIX ACLs supported) > * apply umask, strip setgid (POSIX ACLs supported) Ugh, that's reversed: * POSIX ACLs supported: 1. strip setgid first 2. (possibly) strip umask * POSIX ACLS unsupported: 1. apply umask 2. strip setgid > with possibly different results. > > Mandating inode_init_owner() being used doesn't solve that and I think > it's still brittle overall. > > If we can hoist all of this into vfs_*() before we call into the > filesystem we're better off in the long run. It's worth the imho > negible regression risk. > > > > > Note, BTW, that while XFS has inode_fsuid_set() on the non-inode_init_owner() > > path, it doesn't have inode_fsgid_set() there. Same goes for ext4, while > > ext2 doesn't bother with either in such case... > > Using inode_fs*id_set() is only relevant when the inode is initialized > based on the caller's fs*id. If you request to inherit the parent > directories gid then the caller's gid doesn't matter. > > ext2 doesn't need to care at all about this because it doesn't raise > FS_ALLOW_IDMAP. > > > > > Let's try to separate the issues here. Jann, could you explain what makes > > empty sgid files dangerous? > > I see that's answered in a later mail.