On Tue, May 21, 2024 at 7:34 PM Andrey Albershteyn <aalbersh@xxxxxxxxxx> wrote: > > On 2024-05-20 22:03:43, Amir Goldstein wrote: > > On Mon, May 20, 2024 at 7:46 PM Andrey Albershteyn <aalbersh@xxxxxxxxxx> wrote: > > > > > > XFS has project quotas which could be attached to a directory. All > > > new inodes in these directories inherit project ID set on parent > > > directory. > > > > > > The project is created from userspace by opening and calling > > > FS_IOC_FSSETXATTR on each inode. This is not possible for special > > > files such as FIFO, SOCK, BLK etc. as opening them returns a special > > > inode from VFS. Therefore, some inodes are left with empty project > > > ID. Those inodes then are not shown in the quota accounting but > > > still exist in the directory. > > > > > > This patch adds two new ioctls which allows userspace, such as > > > xfs_quota, to set project ID on special files by using parent > > > directory to open FS inode. This will let xfs_quota set ID on all > > > inodes and also reset it when project is removed. Also, as > > > vfs_fileattr_set() is now will called on special files too, let's > > > forbid any other attributes except projid and nextents (symlink can > > > have one). > > > > > > Signed-off-by: Andrey Albershteyn <aalbersh@xxxxxxxxxx> > > > --- > > > fs/ioctl.c | 93 +++++++++++++++++++++++++++++++++++++++++ > > > include/uapi/linux/fs.h | 11 +++++ > > > 2 files changed, 104 insertions(+) > > > > > > diff --git a/fs/ioctl.c b/fs/ioctl.c > > > index 1d5abfdf0f22..3e3aacb6ea6e 100644 > > > --- a/fs/ioctl.c > > > +++ b/fs/ioctl.c > > > @@ -22,6 +22,7 @@ > > > #include <linux/mount.h> > > > #include <linux/fscrypt.h> > > > #include <linux/fileattr.h> > > > +#include <linux/namei.h> > > > > > > #include "internal.h" > > > > > > @@ -647,6 +648,19 @@ static int fileattr_set_prepare(struct inode *inode, > > > if (fa->fsx_cowextsize == 0) > > > fa->fsx_xflags &= ~FS_XFLAG_COWEXTSIZE; > > > > > > + /* > > > + * The only use case for special files is to set project ID, forbid any > > > + * other attributes > > > + */ > > > + if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode))) { > > > + if (fa->fsx_xflags & ~FS_XFLAG_PROJINHERIT) > > > + return -EINVAL; > > > + if (!S_ISLNK(inode->i_mode) && fa->fsx_nextents) > > > + return -EINVAL; > > > + if (fa->fsx_extsize || fa->fsx_cowextsize) > > > + return -EINVAL; > > > + } > > > + > > > return 0; > > > } > > > > > > @@ -763,6 +777,79 @@ static int ioctl_fssetxattr(struct file *file, void __user *argp) > > > return err; > > > } > > > > > > +static int ioctl_fsgetxattrat(struct file *file, void __user *argp) > > > +{ > > > + struct path filepath; > > > + struct fsxattrat fsxat; > > > + struct fileattr fa; > > > + int error; > > > + > > > + if (!S_ISDIR(file_inode(file)->i_mode)) > > > + return -EBADF; > > > > So the *only* thing that is done with the fd of the ioctl is to verify > > that it is a directory fd - there is no verification that this fd is on the > > same sb as the path to act on. > > > > Was this the intention? It does not make a lot of sense to me > > and AFAIK there is no precedent to an API like this. > > yeah, as we want to set xattrs on that inode the path is pointing > to, so, VFS will call to the FS under that sb. > > > > > There are ioctls that operate on the filesystem using any > > fd on that fs, such as FS_IOC_GETFS{UUID,SYSFSPATH} > > and maybe the closest example to what you are trying to add > > XFS_IOC_BULKSTAT. > > Not sure that I get what you mean here, the *at() part is to get > around VFS special inodes and call vfs_fileattr_set/get on FS inodes. > My point was that with your proposed API the fd argument to ioctl() can be a directory from a completely arbitrary filesystem with nothing to do with the filesystem where fsxat.dfd is from and that makes very little sense from API POV. > > > > Trying to think of a saner API for this - perhaps pass an O_PATH > > fd without any filename in struct fsxattrat, saving you also the > > headache of passing a variable length string in an ioctl. > > > > Then atfile = fdget_raw(fsxat.atfd) and verify that atfile->f_path > > and file->f_path are on the same sb before proceeding to operate > > on atfile->f_path.dentry. > > Thanks! Didn't know about O_PATH that seems to be a way to get rid > of the path passing. > I found one precedent of this pattern with XFS_IOC_FD_TO_HANDLE, but keep in mind that this is quite an old legacy XFS API. This ioctl is performed on an "fshandle", which is an open fd to any object in the filesystem, but typically the mount root dir. The ioctl gets a structure xfs_fsop_handlereq_t which contains an fd member pointing to another object within an XFS filesystem. Actually, AFAICS, this code does not verify that the object and fshandle are on the same XFS filesystem, but only that both are on XFS filesystems: /* * We can only generate handles for inodes residing on a XFS filesystem, * and only for regular files, directories or symbolic links. */ error = -EINVAL; if (inode->i_sb->s_magic != XFS_SB_MAGIC) goto out_put; I don't know what's the best thing to do is, but I think that verifying that the ioctl fd and the O_PATH fd are on the same sb is the least controversial option for the first version - if needed that could be relaxed later on. Another alternative which is simpler from API POV would be to allow selective ioctl() commands on an O_PATH fd, but I think that is going to be more controversial. Something along those lines (completely untested): diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h index 45e4e64fd664..562f8bff91d2 100644 --- a/include/uapi/linux/fs.h +++ b/include/uapi/linux/fs.h @@ -241,6 +241,13 @@ struct fsxattr { */ #define FS_IOC_GETFSSYSFSPATH _IOR(0x15, 1, struct fs_sysfs_path) +#define _IOC_AT (0x100) +#define FS_IOC_AT(nr) (_IOC_TYPE(nr) == _IOC_AT) + +/* The following ioctls can be operated on an O_PATH fd */ +#define FS_IOC_FSGETXATTRAT _IOR(_IOC_AT, 31, struct fsxattr) +#define FS_IOC_FSSETXATTRAT _IOW(_IOC_AT, 32, struct fsxattr) + /* * Inode flags (FS_IOC_GETFLAGS / FS_IOC_SETFLAGS) * diff --git a/fs/ioctl.c b/fs/ioctl.c index 1d5abfdf0f22..f720500c705b 100644 --- a/fs/ioctl.c +++ b/fs/ioctl.c @@ -867,9 +867,11 @@ static int do_vfs_ioctl(struct file *filp, unsigned int fd, return ioctl_setflags(filp, argp); case FS_IOC_FSGETXATTR: + case FS_IOC_FSGETXATTRAT: return ioctl_fsgetxattr(filp, argp); case FS_IOC_FSSETXATTR: + case FS_IOC_FSSETXATTRAT: return ioctl_fssetxattr(filp, argp); case FS_IOC_GETFSUUID: @@ -879,7 +881,7 @@ static int do_vfs_ioctl(struct file *filp, unsigned int fd, return ioctl_get_fs_sysfs_path(filp, argp); default: - if (S_ISREG(inode->i_mode)) + if (!FS_IOC_AT(cmd) && S_ISREG(inode->i_mode)) return file_ioctl(filp, cmd, argp); break; } @@ -889,7 +891,8 @@ static int do_vfs_ioctl(struct file *filp, unsigned int fd, SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg) { - struct fd f = fdget(fd); + bool ioc_at = FS_IOC_AT(cmd); + struct fd f = ioc_at ? fdget_raw(fd) : fdget(fd); int error; if (!f.file) @@ -900,7 +903,7 @@ SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg) goto out; error = do_vfs_ioctl(f.file, fd, cmd, arg); - if (error == -ENOIOCTLCMD) + if (!ioc_at && error == -ENOIOCTLCMD) error = vfs_ioctl(f.file, cmd, arg); --- Thanks, Amir.