On Wed, May 22, 2024 at 04:58:31PM +0200, Andrey Albershteyn wrote: > On 2024-05-21 21:22:41, Amir Goldstein wrote: > > On Tue, May 21, 2024 at 7:34 PM Andrey Albershteyn <aalbersh@xxxxxxxxxx> wrote: > > > > > > On 2024-05-20 22:03:43, Amir Goldstein wrote: > > > > On Mon, May 20, 2024 at 7:46 PM Andrey Albershteyn <aalbersh@xxxxxxxxxx> wrote: > > > > > > > > > > XFS has project quotas which could be attached to a directory. All > > > > > new inodes in these directories inherit project ID set on parent > > > > > directory. > > > > > > > > > > The project is created from userspace by opening and calling > > > > > FS_IOC_FSSETXATTR on each inode. This is not possible for special > > > > > files such as FIFO, SOCK, BLK etc. as opening them returns a special > > > > > inode from VFS. Therefore, some inodes are left with empty project > > > > > ID. Those inodes then are not shown in the quota accounting but > > > > > still exist in the directory. > > > > > > > > > > This patch adds two new ioctls which allows userspace, such as > > > > > xfs_quota, to set project ID on special files by using parent > > > > > directory to open FS inode. This will let xfs_quota set ID on all > > > > > inodes and also reset it when project is removed. Also, as > > > > > vfs_fileattr_set() is now will called on special files too, let's > > > > > forbid any other attributes except projid and nextents (symlink can > > > > > have one). > > > > > > > > > > Signed-off-by: Andrey Albershteyn <aalbersh@xxxxxxxxxx> > > > > > --- > > > > > fs/ioctl.c | 93 +++++++++++++++++++++++++++++++++++++++++ > > > > > include/uapi/linux/fs.h | 11 +++++ > > > > > 2 files changed, 104 insertions(+) > > > > > > > > > > diff --git a/fs/ioctl.c b/fs/ioctl.c > > > > > index 1d5abfdf0f22..3e3aacb6ea6e 100644 > > > > > --- a/fs/ioctl.c > > > > > +++ b/fs/ioctl.c > > > > > @@ -22,6 +22,7 @@ > > > > > #include <linux/mount.h> > > > > > #include <linux/fscrypt.h> > > > > > #include <linux/fileattr.h> > > > > > +#include <linux/namei.h> > > > > > > > > > > #include "internal.h" > > > > > > > > > > @@ -647,6 +648,19 @@ static int fileattr_set_prepare(struct inode *inode, > > > > > if (fa->fsx_cowextsize == 0) > > > > > fa->fsx_xflags &= ~FS_XFLAG_COWEXTSIZE; > > > > > > > > > > + /* > > > > > + * The only use case for special files is to set project ID, forbid any > > > > > + * other attributes > > > > > + */ > > > > > + if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode))) { > > > > > + if (fa->fsx_xflags & ~FS_XFLAG_PROJINHERIT) > > > > > + return -EINVAL; > > > > > + if (!S_ISLNK(inode->i_mode) && fa->fsx_nextents) > > > > > + return -EINVAL; > > > > > + if (fa->fsx_extsize || fa->fsx_cowextsize) > > > > > + return -EINVAL; > > > > > + } > > > > > + > > > > > return 0; > > > > > } > > > > > > > > > > @@ -763,6 +777,79 @@ static int ioctl_fssetxattr(struct file *file, void __user *argp) > > > > > return err; > > > > > } > > > > > > > > > > +static int ioctl_fsgetxattrat(struct file *file, void __user *argp) > > > > > +{ > > > > > + struct path filepath; > > > > > + struct fsxattrat fsxat; > > > > > + struct fileattr fa; > > > > > + int error; > > > > > + > > > > > + if (!S_ISDIR(file_inode(file)->i_mode)) > > > > > + return -EBADF; > > > > > > > > So the *only* thing that is done with the fd of the ioctl is to verify > > > > that it is a directory fd - there is no verification that this fd is on the > > > > same sb as the path to act on. > > > > > > > > Was this the intention? It does not make a lot of sense to me > > > > and AFAIK there is no precedent to an API like this. > > > > > > yeah, as we want to set xattrs on that inode the path is pointing > > > to, so, VFS will call to the FS under that sb. > > > > > > > > > > > There are ioctls that operate on the filesystem using any > > > > fd on that fs, such as FS_IOC_GETFS{UUID,SYSFSPATH} > > > > and maybe the closest example to what you are trying to add > > > > XFS_IOC_BULKSTAT. > > > > > > Not sure that I get what you mean here, the *at() part is to get > > > around VFS special inodes and call vfs_fileattr_set/get on FS inodes. > > > > > > > My point was that with your proposed API the fd argument to > > ioctl() can be a directory from a completely arbitrary filesystem > > with nothing to do with the filesystem where fsxat.dfd is from > > and that makes very little sense from API POV. > > I see, yeah, I can add a check for this, for xfs_quota usecase it's > fine as it doesn't follow cross mounts. Do the other *at() syscalls prohibit dfd + path pointing to a different filesystem? It seems odd to have this restriction that the rest don't, but perhaps documenting this in the ioctl_xfs_fsgetxattrat manpage is ok. --D > > > > > > > > > > Trying to think of a saner API for this - perhaps pass an O_PATH > > > > fd without any filename in struct fsxattrat, saving you also the > > > > headache of passing a variable length string in an ioctl. > > > > > > > > Then atfile = fdget_raw(fsxat.atfd) and verify that atfile->f_path > > > > and file->f_path are on the same sb before proceeding to operate > > > > on atfile->f_path.dentry. > > > > > > Thanks! Didn't know about O_PATH that seems to be a way to get rid > > > of the path passing. > > > > > > > I found one precedent of this pattern with XFS_IOC_FD_TO_HANDLE, > > but keep in mind that this is quite an old legacy XFS API. > > > > This ioctl is performed on an "fshandle", which is an open fd to any > > object in the filesystem, but typically the mount root dir. > > The ioctl gets a structure xfs_fsop_handlereq_t which contains an > > fd member pointing to another object within an XFS filesystem. > > > > Actually, AFAICS, this code does not verify that the object and fshandle > > are on the same XFS filesystem, but only that both are on XFS filesystems: > > > > /* > > * We can only generate handles for inodes residing on a XFS filesystem, > > * and only for regular files, directories or symbolic links. > > */ > > error = -EINVAL; > > if (inode->i_sb->s_magic != XFS_SB_MAGIC) > > goto out_put; > > > > I don't know what's the best thing to do is, but I think that verifying > > that the ioctl fd and the O_PATH fd are on the same sb is the least > > controversial option for the first version - if needed that could be > > relaxed later on. > > > > Another alternative which is simpler from API POV would be to allow > > selective ioctl() commands on an O_PATH fd, but I think that is going > > to be more controversial. > > > > Something along those lines (completely untested): > > > > diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h > > index 45e4e64fd664..562f8bff91d2 100644 > > --- a/include/uapi/linux/fs.h > > +++ b/include/uapi/linux/fs.h > > @@ -241,6 +241,13 @@ struct fsxattr { > > */ > > #define FS_IOC_GETFSSYSFSPATH _IOR(0x15, 1, struct fs_sysfs_path) > > > > +#define _IOC_AT (0x100) > > +#define FS_IOC_AT(nr) (_IOC_TYPE(nr) == _IOC_AT) > > + > > +/* The following ioctls can be operated on an O_PATH fd */ > > +#define FS_IOC_FSGETXATTRAT _IOR(_IOC_AT, 31, struct fsxattr) > > +#define FS_IOC_FSSETXATTRAT _IOW(_IOC_AT, 32, struct fsxattr) > > + > > /* > > * Inode flags (FS_IOC_GETFLAGS / FS_IOC_SETFLAGS) > > * > > diff --git a/fs/ioctl.c b/fs/ioctl.c > > index 1d5abfdf0f22..f720500c705b 100644 > > --- a/fs/ioctl.c > > +++ b/fs/ioctl.c > > @@ -867,9 +867,11 @@ static int do_vfs_ioctl(struct file *filp, unsigned int fd, > > return ioctl_setflags(filp, argp); > > > > case FS_IOC_FSGETXATTR: > > + case FS_IOC_FSGETXATTRAT: > > return ioctl_fsgetxattr(filp, argp); > > > > case FS_IOC_FSSETXATTR: > > + case FS_IOC_FSSETXATTRAT: > > return ioctl_fssetxattr(filp, argp); > > > > case FS_IOC_GETFSUUID: > > @@ -879,7 +881,7 @@ static int do_vfs_ioctl(struct file *filp, unsigned int fd, > > return ioctl_get_fs_sysfs_path(filp, argp); > > > > default: > > - if (S_ISREG(inode->i_mode)) > > + if (!FS_IOC_AT(cmd) && S_ISREG(inode->i_mode)) > > return file_ioctl(filp, cmd, argp); > > break; > > } > > @@ -889,7 +891,8 @@ static int do_vfs_ioctl(struct file *filp, unsigned int fd, > > > > SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg) > > { > > - struct fd f = fdget(fd); > > + bool ioc_at = FS_IOC_AT(cmd); > > + struct fd f = ioc_at ? fdget_raw(fd) : fdget(fd); > > int error; > > > > if (!f.file) > > @@ -900,7 +903,7 @@ SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned > > int, cmd, unsigned long, arg) > > goto out; > > > > error = do_vfs_ioctl(f.file, fd, cmd, arg); > > - if (error == -ENOIOCTLCMD) > > + if (!ioc_at && error == -ENOIOCTLCMD) > > error = vfs_ioctl(f.file, cmd, arg); > > > > --- > > > > Thanks, > > Amir. > > > > -- > - Andrey > >