On Wed 11-10-23 12:51:12, Max Kellermann wrote: > On Wed, Oct 11, 2023 at 12:05 PM Jan Kara <jack@xxxxxxx> wrote: > > So I've checked some more and the kernel doc comments before > > mode_strip_umask() and vfs_prepare_mode() make it pretty obvious - all > > paths creating new inodes must be calling vfs_prepare_mode(). As a result > > mode_strip_umask() which handles umask stripping for filesystems not > > supporting posix ACLs. For filesystems that do support ACLs, > > posix_acl_create() must be call and that handles umask stripping. So your > > fix should not be needed. CCed some relevant people for confirmation. > > Thanks, Jan. Do you think the documentation is obvious enough, or > shall I look around and try to improve the documentation? I'm not a FS > expert, so it may be just my fault that it confused me.... I just > analyzed the O_TMPFILE vulnerability four years ago (because it was > reported to me as the maintainer of a userspace software). > > Apart from my doubts that this API contract is too error prone, I'm > not quite sure if all filesystems really implement it properly. > > For example, overlayfs unconditionally sets SB_POSIXACL, even if the > kernel has no ACL support. Would this ignore the umask? I'm not sure, > overlayfs is a special beast. > Then there's orangefs which allows setting the "acl" mount option (and > thus SB_POSIXACL) even if the kernel has no ACL support. Same for gfs2 > and maybe cifs, maybe more, I didn't check them all. Indeed, *that* looks like a bug. Good spotting! I'd say posix_acl_create() defined in include/linux/posix_acl.h for the !CONFIG_FS_POSIX_ACL case should be stripping mode using umask. Care to send a patch for this? > The "mainstream" filesystems like ext4 seem to be implemented > properly, though this is still too fragile for my taste... ext4 has > the SB_POSIXACL code even if there's no kernel ACL support, but it is > not reachable because EXT4_MOUNT_POSIX_ACL cannot be set from > userspace in that case. The code looks suspicious, but is okay in the > end - still not my taste. > > I see so much redundant code regarding the "acl" mount option in all > filesystems. I believe the API should be designed in a way that it is > safe-by-default, and shouldn't need very careful considerations in > each and every filesystem, or else all filesystems repeat the same > mistakes until the last one gets fixed. So I definitely agree that we should handle as many things as possible in VFS without relying on filesystems to get it right. Thus I agree VFS should do the right thing even if the filesystem sets SB_POSIXACl when !CONFIG_FS_POSIX_ACL. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR