Re: [PATCH v2 2/3] vfs: strip file's S_ISGID mode on vfs instead of on underlying filesystem

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



on 2022/4/15 22:02, Christian Brauner wrote:
> On Fri, Apr 15, 2022 at 03:14:53AM +0000, xuyang2018.jy@xxxxxxxxxxx wrote:
>> on 2022/4/14 20:45, Christian Brauner wrote:
>>> On Thu, Apr 14, 2022 at 03:57:18PM +0800, Yang Xu wrote:
>>>> Currently, vfs only passes mode argument to filesystem, then use inode_init_owner()
>>>> to strip S_ISGID. Some filesystem(ie ext4/btrfs) will call inode_init_owner
>>>> firstly, then posxi acl setup, but xfs uses the contrary order. It will affect
>>>> S_ISGID clear especially we filter S_IXGRP by umask or acl.
>>>>
>>>> Regardless of which filesystem is in use, failure to strip the SGID correctly is
>>>> considered a security failure that needs to be fixed. The current VFS infrastructure
>>>> requires the filesystem to do everything right and not step on any landmines to
>>>> strip the SGID bit, when in fact it can easily be done at the VFS and the filesystems
>>>> then don't even need to be aware that the SGID needs to be (or has been stripped) by
>>>> the operation the user asked to be done.
>>>>
>>>> Vfs has all the info it needs - it doesn't need the filesystems to do everything
>>>> correctly with the mode and ensuring that they order things like posix acl setup
>>>> functions correctly with inode_init_owner() to strip the SGID bit.
>>>>
>>>> Just strip the SGID bit at the VFS, and then the filesystems can't get it wrong.
>>>>
>>>> Also, the inode_sgid_strip() api should be used before IS_POSIXACL() because
>>>> this api may change mode.
>>>>
>>>> Only the following places use inode_init_owner
>>>> "hugetlbfs/inode.c:846:          inode_init_owner(&init_user_ns, inode, dir, mode);
>>>>    nilfs2/inode.c:354:     inode_init_owner(&init_user_ns, inode, dir, mode);
>>>>    zonefs/super.c:1289:    inode_init_owner(&init_user_ns, inode, parent, S_IFDIR | 0555);
>>>>    reiserfs/namei.c:619:   inode_init_owner(&init_user_ns, inode, dir, mode);
>>>>    jfs/jfs_inode.c:67:     inode_init_owner(&init_user_ns, inode, parent, mode);
>>>>    f2fs/namei.c:50:        inode_init_owner(mnt_userns, inode, dir, mode);
>>>>    ext2/ialloc.c:549:              inode_init_owner(&init_user_ns, inode, dir, mode);
>>>>    overlayfs/dir.c:643:    inode_init_owner(&init_user_ns, inode, dentry->d_parent->d_inode, mode);
>>>>    ufs/ialloc.c:292:       inode_init_owner(&init_user_ns, inode, dir, mode);
>>>>    ntfs3/inode.c:1283:     inode_init_owner(mnt_userns, inode, dir, mode);
>>>>    ramfs/inode.c:64:               inode_init_owner(&init_user_ns, inode, dir, mode);
>>>>    9p/vfs_inode.c:263:     inode_init_owner(&init_user_ns, inode, NULL, mode);
>>>>    btrfs/tests/btrfs-tests.c:65:   inode_init_owner(&init_user_ns, inode, NULL, S_IFREG);
>>>>    btrfs/inode.c:6215:     inode_init_owner(mnt_userns, inode, dir, mode);
>>>>    sysv/ialloc.c:166:      inode_init_owner(&init_user_ns, inode, dir, mode);
>>>>    omfs/inode.c:51:        inode_init_owner(&init_user_ns, inode, NULL, mode);
>>>>    ubifs/dir.c:97: inode_init_owner(&init_user_ns, inode, dir, mode);
>>>>    udf/ialloc.c:108:       inode_init_owner(&init_user_ns, inode, dir, mode);
>>>>    ext4/ialloc.c:979:              inode_init_owner(mnt_userns, inode, dir, mode);
>>>>    hfsplus/inode.c:393:    inode_init_owner(&init_user_ns, inode, dir, mode);
>>>>    xfs/xfs_inode.c:840:            inode_init_owner(mnt_userns, inode, dir, mode);
>>>>    ocfs2/dlmfs/dlmfs.c:331:                inode_init_owner(&init_user_ns, inode, NULL, mode);
>>>>    ocfs2/dlmfs/dlmfs.c:354:        inode_init_owner(&init_user_ns, inode, parent, mode);
>>>>    ocfs2/namei.c:200:      inode_init_owner(&init_user_ns, inode, dir, mode);
>>>>    minix/bitmap.c:255:     inode_init_owner(&init_user_ns, inode, dir, mode);
>>>>    bfs/dir.c:99:   inode_init_owner(&init_user_ns, inode, dir, mode);
>>>> "
>>>
>>> For completeness sake, there's also spufs which doesn't really go
>>> through the regular VFS callpath because it has separate system calls
>>> like:
>>>
>>> SYSCALL_DEFINE4(spu_create, const char __user *, name, unsigned int, flags,
>>> 	umode_t, mode, int, neighbor_fd)
>>>
>>> but looking through the code it only allows the creation of directories and only
>>> allows bits in 0777.
>> IMO, this fs also doesn't use inode_init_owner, so it should be not
>> affected. We add indo_sgid_strip into vfs, IMO, it only happen new sgid
>> strip situation and doesn't happen to remove old sgid strip situation.
>> So I think it is "safe".
>
> It does. The callchain spu_create() with SP_CREATE_GANG ends up in
> spufs_mkgang() which calls inode_init_owner(). But as I said it's not a
> problem since this only creates directories anyway.
Sorry, I miss this message before.
Oh, Yes, I only search inode_init_owner in linux/fs directory instead of 
the whole linux source directory.

It seems I also miss bpf and shmem.

kernel/bpf/inode.c:     inode_init_owner(&init_user_ns, inode, dir, mode);
mm/shmem.c:             inode_init_owner(&init_user_ns, inode, dir, mode);
arch/powerpc/platforms/cell/spufs/inode.c: 
inode_init_owner(&init_user_ns, inode, dir, mode | S_IFDIR);
arch/powerpc/platforms/cell/spufs/inode.c: 
inode_init_owner(&init_user_ns, inode, dir, mode | S_IFDIR)

bpf use vfs_mkobj in bpf_obj_do_pin with "S_IFREG | ((S_IRUSR | S_IWUSR) 
& ~current_umask()) mode and use bpf_mkobj_ops in 
bpf_iter_link_pin_kernel with S_IFREG | S_IRUSR; , so bpf is also not 
affected.

Also shmem used standard vfs api, so it is not affected. I will add the 
three missing things(spufs, bpf, shmem) in my commit message.

Best Regards
Yang Xu
>




[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux