Re: Re: Re: [PATCH v2 2/4] fs: add FS_IOC_FSSETXATTRAT and FS_IOC_FSGETXATTRAT

Amir Goldstein <amir73il@xxxxxxxxx> · Tue, 11 Jun 2024 10:57:29 +0300

On Mon, Jun 10, 2024 at 11:26 PM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
>
> On Mon, Jun 10, 2024 at 04:21:39PM +0300, Amir Goldstein wrote:
> > On Mon, Jun 10, 2024 at 2:50 PM Andrey Albershteyn <aalbersh@xxxxxxxxxx> wrote:
> > >
> > > On 2024-06-10 12:19:50, Amir Goldstein wrote:
> > > > On Mon, Jun 10, 2024 at 11:17 AM Andrey Albershteyn <aalbersh@xxxxxxxxxx> wrote:
> > > > >
> > > > > On 2024-06-06 12:27:38, Dave Chinner wrote:
> > > > ...
> > > > > >
> > > > > > The only reason XFS returns -EXDEV to rename across project IDs is
> > > > > > because nobody wanted to spend the time to work out how to do the
> > > > > > quota accounting of the metadata changed in the rename operation
> > > > > > accurately. So for that rare case (not something that would happen
> > > > > > on the NAS product) we returned -EXDEV to trigger the mv command to
> > > > > > copy the file to the destination and then unlink the source instead,
> > > > > > thereby handling all the quota accounting correctly.
> > > > > >
> > > > > > IOWs, this whole "-EXDEV on rename across parent project quota
> > > > > > boundaries" is an implementation detail and nothing more.
> > > > > > Filesystems that implement project quotas and the directory tree
> > > > > > sub-variant don't need to behave like this if they can accurately
> > > > > > account for the quota ID changes during an atomic rename operation.
> > > > > > If that's too hard, then the fallback is to return -EXDEV and let
> > > > > > userspace do it the slow way which will always acocunt the resource
> > > > > > usage correctly to the individual projects.
> > > > > >
> > > > > > Hence I think we should just fix the XFS kernel behaviour to do the
> > > > > > right thing in this special file case rather than return -EXDEV and
> > > > > > then forget about the rest of it.
> > > > >
> > > > > I see, I will look into that, this should solve the original issue.
> > > >
> > > > I see that you already got Darrick's RVB on the original patch:
> > > > https://lore.kernel.org/linux-xfs/20240315024826.GA1927156@frogsfrogsfrogs/
> > > >
> > > > What is missing then?
> > > > A similar patch for rename() that allows rename of zero projid special
> > > > file as long as (target_dp->i_projid == src_dp->i_projid)?
> > > >
> > > > In theory, it would have been nice to fix the zero projid during the
> > > > above link() and rename() operations, but it would be more challenging
> > > > and I see no reason to do that if all the other files remain with zero
> > > > projid after initial project setup (i.e. if not implementing the syscalls).
> > >
> > > I think Dave suggests to get rid of this if-guard and allow
> > > link()/rename() for special files but with correct quota calculation.
> > >
> > > >
> > > > >
> > > > > But those special file's inodes still will not be accounted by the
> > > > > quota during initial project setup (xfs_quota will skip them), would
> > > > > it worth it adding new syscalls anyway?
> > > > >
> > > >
> > > > Is it worth it to you?
> > > >
> > > > Adding those new syscalls means adding tests and documentation
> > > > and handle all the bugs later.
> > > >
> > > > If nobody cared about accounting of special files inodes so far,
> > > > there is no proof that anyone will care that you put in all this work.
> > >
> > > I already have patch and some simple man-pages prepared, I'm
> > > wondering if this would be useful for any other usecases
> >
> > Yes, I personally find it useful.
> > I have applications that query the fsx_xflags and would rather
> > be able to use O_PATH to query/set those flags, since
> > internally in vfs, fileattr_[gs]et() do not really need an open file.
> >
> > > which would
> > > require setting extended attributes on spec indodes.
> >
> > Please do not use the terminology "extended attributes" in the man page
> > to describe struct fsxattr.
>
> "XFS file attributes" perhaps?
>
> Though that's anachronistic since ext4 supports /some/ of them now.
>

Technically, it's all the filesystems that support ->fileattr_[gs]et(),
which is a lot.

Since the feature has matured into a vfs feature with planned new syscalls,
the XFS ioctl man page could be referenced for historic perspective and
for explaining the reason for the X in fsxattr, but I don't think that
the official
name of those attributes should include "XFS".

Maybe TAFKAXFA (The attributes formerly known as XFS file attributes) :-p

Also, adding a syscall to perform FS_IOC_FS[GS]ETXATTR, without
being able to perform FS_IOC_[GS]ETFLAGS would be a bit odd IMO.
It is possible to multiplex both in the same syscall.
For example, let the syscall return the size of the attributes, like getxattr().
Not sure whether it is smart to do it.

Then again, not sure if this was already mentioned as an option, besides
the more generic proposal by Miklos [1], but we could bind generic xattr
handlers for "system.fs.attr" and "system.fs.flags".
API-wise, ->fileattr_[gs]et() would become very similar to ->[gs]et_acl(),
which are also get/set via "system.*" xattr.

The added benefit is that we avoid the confusing distinctions between
"additional attributes" and "extended attributes" -
"additional file attributes/flags" are a private case of "extended attributes".

Implementation wise it should be pretty simple, but I agree with Jan that
the question of credentials needed for these syscalls needs to be
addressed first and get Christian's blessing.

Thanks,
Amir.

[1] https://lore.kernel.org/linux-fsdevel/YnEeuw6fd1A8usjj@xxxxxxxxxxxxxxxxxxxxxxxxx/