on 2022/4/13 17:59, Christian Brauner write: > On Wed, Apr 13, 2022 at 09:45:11AM +0000, xuyang2018.jy@xxxxxxxxxxx wrote: >> on 2022/4/13 16:59, Christian Brauner wrote: >>> On Tue, Apr 12, 2022 at 07:33:45PM +0800, Yang Xu wrote: >>>> The current_umask() is stripped from the mode directly in the vfs if the >>>> filesystem either doesn't support acls or the filesystem has been >>>> mounted without posic acl support. >>>> >>>> If the filesystem does support acls then current_umask() stripping is >>>> deferred to posix_acl_create(). So when the filesystem calls >>>> posix_acl_create() and there are no acls set or not supported then >>>> current_umask() will be stripped. >>>> >>>> Here we only use umask(S_IXGRP) to check whether inode strip >>>> S_ISGID works correctly. >>>> >>>> Signed-off-by: Yang Xu<xuyang2018.jy@xxxxxxxxxxx> >>>> --- >>>> src/idmapped-mounts/idmapped-mounts.c | 505 +++++++++++++++++++++++++- >>>> tests/generic/680 | 26 ++ >>>> tests/generic/680.out | 2 + >>>> 3 files changed, 532 insertions(+), 1 deletion(-) >>>> create mode 100755 tests/generic/680 >>>> create mode 100644 tests/generic/680.out >>>> >>>> diff --git a/src/idmapped-mounts/idmapped-mounts.c b/src/idmapped-mounts/idmapped-mounts.c >>>> index 02f91558..e6c14586 100644 >>>> --- a/src/idmapped-mounts/idmapped-mounts.c >>>> +++ b/src/idmapped-mounts/idmapped-mounts.c >>>> @@ -14146,6 +14146,494 @@ out: >>>> return fret; >>>> } >>>> >>>> +/* The following tests are concerned with setgid inheritance. These can be >>>> + * filesystem type specific. For xfs, if a new file or directory or node is >>>> + * created within a setgid directory and irix_sgid_inhiert is set then inheritthe >>>> + * setgid bit if the caller is in the group of the directory. >>>> + * >>>> + * The current_umask() is stripped from the mode directly in the vfs if the >>>> + * filesystem either doesn't support acls or the filesystem has been >>>> + * mounted without posic acl support. >>>> + * >>>> + * If the filesystem does support acls then current_umask() stripping is >>>> + * deferred to posix_acl_create(). So when the filesystem calls >>>> + * posix_acl_create() and there are no acls set or not supported then >>>> + * current_umask() will be stripped. >>>> + * >>>> + * Use umask(S_IXGRP) to check whether inode strip S_ISGID works correctly. >>>> + */ >>>> +static int setgid_create_umask(void) >>>> +{ >>>> + int fret = -1; >>>> + int file1_fd = -EBADF; >>>> + int tmpfile_fd = -EBADF; >>>> + pid_t pid; >>>> + bool supported = false; >>>> + char path[PATH_MAX]; >>>> + mode_t mode; >>>> + >>>> + if (!caps_supported()) >>>> + return 0; >>>> + >>>> + if (fchmod(t_dir1_fd, S_IRUSR | >>>> + S_IWUSR | >>>> + S_IRGRP | >>>> + S_IWGRP | >>>> + S_IROTH | >>>> + S_IWOTH | >>>> + S_IXUSR | >>>> + S_IXGRP | >>>> + S_IXOTH | >>>> + S_ISGID), 0) { >>>> + log_stderr("failure: fchmod"); >>>> + goto out; >>>> + } >>>> + >>>> + /* Verify that the setgid bit got raised. */ >>>> + if (!is_setgid(t_dir1_fd, "", AT_EMPTY_PATH)) { >>>> + log_stderr("failure: is_setgid"); >>>> + goto out; >>>> + } >>>> + >>>> + supported = openat_tmpfile_supported(t_dir1_fd); >>>> + >>>> + /* Only umask with S_IXGRP because inode strip S_ISGID will check mode >>>> + * whether has group execute or search permission. >>>> + */ >>>> + umask(S_IXGRP); >>>> + mode = umask(S_IXGRP); >>>> + if (!(mode& S_IXGRP)) >>>> + die("failure: umask"); >>>> + >>>> + pid = fork(); >>>> + if (pid< 0) { >>>> + log_stderr("failure: fork"); >>>> + goto out; >>>> + } >>>> + if (pid == 0) { >>>> + if (!switch_ids(0, 10000)) >>>> + die("failure: switch_ids"); >>>> + >>>> + if (!caps_down_fsetid()) >>>> + die("failure: caps_down_fsetid"); >>>> + >>>> + /* create regular file via open() */ >>>> + file1_fd = openat(t_dir1_fd, FILE1, O_CREAT | O_EXCL | O_CLOEXEC, S_IXGRP | S_ISGID); >>>> + if (file1_fd< 0) >>>> + die("failure: create"); >>>> + >>>> + /* Neither in_group_p() nor capable_wrt_inode_uidgid() so setgid >>>> + * bit needs to be stripped. >>>> + */ >>>> + if (is_setgid(t_dir1_fd, FILE1, 0)) >>>> + die("failure: is_setgid"); >>> >>> This test is wrong. Specifically, it is a false positive. I have >>> explained this in more detail on v2 of this patchset. >>> >>> You want to test that umask(S_IXGRP) + setgid inheritance work together >>> correctly. First, we need to establish what that means because from your >>> patch series it at least seems to me as you're not completely clear >>> about what you want to test just yet. >>> >>> A requested setgid bit for a non-directory object is only considered for >>> stripping if S_IXGRP is simultaneously requested. In other words, both >>> S_SISGID and S_IXGRP must be requested for the new file in order for >>> additional checks such as CAP_FSETID to become relevant. >> Yes, only keep S_IXGRP, then we can run into the next judgement for >> group and cap_fsetid. >>> >>> We need to distinguish two cases afaict: >>> >>> 1. The filesystem does support ACLs and has an applicable ACL >>> ------------------------------------------------------------- >>> >>> umask(S_IXGRP) is not used by the kernel and thus S_IXGRP is not >>> stripped (unless the ACL does make it so) and so when e.g. >>> inode_init_owner() is called we do not expect the file to inherit the >>> setgid bit. >>> >>> This is what your test above is handling correctly. But it is a false >>> positive because what you're trying to test is the behavior of >>> umask(S_IXGRP) but it is made irrelevant by ACL support of the >>> underlying filesystem. >> I test this situation in the next patch as below: >> umask(S_IXGRP); >> snprintf(t_buf, sizeof(t_buf), "setfacl -d -m u::rwx,g::rwx,o::rwx,m:rw >> %s/%s", t_mountpoint, T_DIR1) >> >> and >> umask(S_IXGRP); >> snprintf(t_buf, sizeof(t_buf), "setfacl -d -m u::rwx,g::rw,o::rwx, >> %s/%s", t_mountpoint, T_DIR1 >> >>> >>> 2. The filesystem does not support ACLs, has been mounted without >>> ----------------------------------------------------------------- >>> support for it, or has no applicable ACL >>> ---------------------------------------- >>> >>> umask(S_IXGRP) is used by the kernel and will be stripped from the mode. >>> So when inode_init_owner() is called we expect the file to inherit the >>> setgid bit. >> This is why my kernel patch put strip setgid code into vfs. >> xfs will inherit the setgid bit but ext4 not because the new_inode >> function and posix_acl_create function order(S_IXGRP mode has been >> stripped before pass to inode_init_owner). >>> >>> This means the test for this case needs to be: >>> >>> file1_fd = openat(t_dir1_fd, FILE1, O_CREAT | O_EXCL | O_CLOEXEC, S_IXGRP | S_ISGID); >>> if (file1_fd< 0) >>> die("failure: create"); >>> >>> /* >>> * S_IXGRP will have been removed due to umask(S_IXGRP) so we expect the >>> * new file to be S_ISGID. >> No No No, see the kernel patch url >> https://patchwork.kernel.org/project/linux-fsdevel/patch/1648461389-2225-2-git-send-email-xuyang2018.jy@xxxxxxxxxxx/ > > Ok, so your testing patches are premised on your kernel patchset. That > wasn't clear to me. > Because of Darrick's request when reviewing this kernel patchset, so then I begin to refactor setgid_create code in fstets... I should have added this in this patch' commit message, sorry. Will add it in v4. > The kernel patchset removes the setgid bit _before_ the umask is applied > which is why you're expecting this file to not be setgid because you > also dropped CAP_FSETID and you'r not in the group of the directory. Yes. > > (Ok, let's defer that discussion to the updated patchset.) I am rewritting the kernel patch commit message to clarify why put this strip code into vfs is safer, but I still have several filesystem code not to see. I will send a v2 kernel patchset tomorrow. Best Regards Yang Xu