On Mon, Nov 23, 2020 at 10:06 PM Andreas Dilger <adilger@xxxxxxxxx> wrote: > > On Nov 23, 2020, at 7:12 AM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote: > > > > Dmitry found an issue with overlayfs's FS_IOC_SETFLAGS/FS_IOC_FSSETXATTR > > implementation: > > > > https://lore.kernel.org/linux-unionfs/CACT4Y+bUfavwMVv2SEMve5pabE_AwsDO0YsRBGZtYqX59a77vA@xxxxxxxxxxxxxx/ > > > > I think the only proper soltuion is to move these into inode operations, which > > should be a good cleanup as well. > > > > This is a first cut, the FS_IOC_SETFLAGS conversion is not complete, and only > > lightly tested on ext4 and xfs. > > > > There are minor changes in behavior, like the exact errno value in case of > > multiple error conditions. > > > > 34 files changed, 668 insertions(+), 1170 deletions(-) > > > Hi Miklos, > this looks like a good reduction in code duplication (-500 LOC is nice). > > One issues I have with this patch is that it spreads the use of "fsxattr" > asthe name for these inode flags further into the VFS. That was inherited > from XFS because of the ioctl name, but is very confusing with "real" > extended attributes, also using get/setxattr names but totally differently. > > It would be better to use function/variable names with "xflags" and "iflags" > that are already used in several filesystems to separate this from xattrs. Makes sense. > > It wouldn't be terrible to also rename the ioctl to FS_IOC_FSSETXFLAGS and > keep a #define for FS_IOC_FSSETXATTR for compatibility, but that is less > critical if that is now only used in one place in the code. > > Some more comments inline... > > > --- a/fs/ioctl.c > > +++ b/fs/ioctl.c > > +/* > > + * Generic function to check FS_IOC_FSSETXATTR/FS_IOC_SETFLAGS values and reject > > + * any invalid configurations. > > + * > > + * Note: must be called with inode lock held. > > + */ > > +static int fssetxattr_prepare(struct inode *inode, > > + const struct kfsxattr *old_fa, > > + struct kfsxattr *fa) > > > +{ > > + /* > > + * The IMMUTABLE and APPEND_ONLY flags can only be changed by > > + * the relevant capability. > > + */ > > + if ((fa->fsx_flags ^ old_fa->fsx_flags) & (FS_APPEND_FL | FS_IMMUTABLE_FL) && > > + !capable(CAP_LINUX_IMMUTABLE)) > > + return -EPERM; > > + > > + if (fa->flags_valid) > > + return fscrypt_prepare_setflags(inode, old_fa->fsx_flags, fa->fsx_flags); > > This doesn't seem right? It means if iflags are set, the rest of the checks are > skipped *even if* there are no problems with the fscrypt flags? That bypasses > the DAX and PROJINHERIT checks later in this function, but it is also possible to > set/clear those flags via FS_IOC_SETFLAGS, and is not very obvious for the code > flow. I'd think this should be something more like: > > if (IS_ENCRYPTED(inode) && fa->flags_valid) { > rc = fscrypt_prepare_setflags(...); > if (rc) > return rc; > } > > and continue on to the rest of the checks, and maybe skip the xflags-specific > checks (EXTSIZE, EXTSZINHERIT, COWEXTSIZE) if xflags_valid is not set, though > they would just be no-ops in that case since the iflags interface will not > set those flags. > > Alternately, move the DAX and PROJINHERIT checks above "flags_valid", but add > a comment that the remaining checks are only for xflags-specific values. Good point. I wasn't actually looking at the code, just converting the old one to the new, and apparently the DAX and PROJINHERIT checks were missing for the FS_IOC_SETFLAGS case (see vfs_ioc_setflags_prepare). > > +int vfs_fssetxattr(struct dentry *dentry, struct kfsxattr *fa) > > +{ > > + struct inode *inode = d_inode(dentry); > > + struct kfsxattr old_fa; > > + int err; > > + > > + if (d_is_special(dentry)) > > + return -ENOTTY; > > + > > + if (!inode->i_op->fssetxattr) > > + return -ENOIOCTLCMD; > > + > > + if (!inode_owner_or_capable(inode)) > > + return -EPERM; > > + > > + inode_lock(inode); > > + err = vfs_fsgetxattr(dentry, &old_fa); > > + if (!err) { > > + /* initialize missing bits from old_fa */ > > + if (fa->flags_valid) { > > + fa->fsx_xflags |= old_fa.fsx_xflags & ~FS_XFLAG_COMMON; > > + fa->fsx_extsize = old_fa.fsx_extsize; > > + fa->fsx_nextents = old_fa.fsx_nextents; > > + fa->fsx_projid = old_fa.fsx_projid; > > + fa->fsx_cowextsize = old_fa.fsx_cowextsize; > > + } else { > > + fa->fsx_flags |= old_fa.fsx_flags & ~FS_COMMON_FL; > > + } > > This extra call to vfs_fsgetxattr() is adding pure overhead for the case of > FS_IOC_GETFLAGS and is totally unnecessary. If iflags_valid is set, then > none of these other fields should be accessed in the ->fssetxattr() method, > and they can check for iflags_valid vs. xflags_valid themselves to see which > ioctl is being called and only access fields which are valid. The extra call is needed for fssetxattr_prepare() for the the checks anyway. Filling in the other fields is a bonus that makes fs code simpler in some cases (e.g. xfs which mainly just looks at xflags). Also ->fsgetxattr() is cheap for all filesystems (and this syscall shouldn't be performance sensitive anyway) so I don't see why we'd need to optimize this at all. Thanks, Miklos