Re: [PATCH v8 RESEND 3/8] vfs: plumb i_version handling into struct kstat

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2023-01-25 at 16:50 +0100, Christian Brauner wrote:
> On Tue, Jan 24, 2023 at 02:30:20PM -0500, Jeff Layton wrote:
> > The NFS server has a lot of special handling for different types of
> > change attribute access, depending on the underlying filesystem. In
> > most cases, it's doing a getattr anyway and then fetching that value
> > after the fact.
> > 
> > Rather that do that, add a new STATX_CHANGE_COOKIE flag that is a
> > kernel-only symbol (for now). If requested and getattr can implement it,
> > it can fill out this field. For IS_I_VERSION inodes, add a generic
> > implementation in vfs_getattr_nosec. Take care to mask
> > STATX_CHANGE_COOKIE off in requests from userland and in the result
> > mask.
> > 
> > Since not all filesystems can give the same guarantees of monotonicity,
> > claim a STATX_ATTR_CHANGE_MONOTONIC flag that filesystems can set to
> > indicate that they offer an i_version value that can never go backward.
> > 
> > Eventually if we decide to make the i_version available to userland, we
> > can just designate a field for it in struct statx, and move the
> > STATX_CHANGE_COOKIE definition to the uapi header.
> > 
> > Reviewed-by: NeilBrown <neilb@xxxxxxx>
> > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > ---
> >  fs/stat.c            | 17 +++++++++++++++--
> >  include/linux/stat.h |  9 +++++++++
> >  2 files changed, 24 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/stat.c b/fs/stat.c
> > index d6cc74ca8486..f43afe0081fe 100644
> > --- a/fs/stat.c
> > +++ b/fs/stat.c
> > @@ -18,6 +18,7 @@
> >  #include <linux/syscalls.h>
> >  #include <linux/pagemap.h>
> >  #include <linux/compat.h>
> > +#include <linux/iversion.h>
> >  
> >  #include <linux/uaccess.h>
> >  #include <asm/unistd.h>
> > @@ -122,6 +123,11 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
> >  	stat->attributes_mask |= (STATX_ATTR_AUTOMOUNT |
> >  				  STATX_ATTR_DAX);
> >  
> > +	if ((request_mask & STATX_CHANGE_COOKIE) && IS_I_VERSION(inode)) {
> > +		stat->result_mask |= STATX_CHANGE_COOKIE;
> > +		stat->change_cookie = inode_query_iversion(inode);
> > +	}
> > +
> >  	mnt_userns = mnt_user_ns(path->mnt);
> >  	if (inode->i_op->getattr)
> >  		return inode->i_op->getattr(mnt_userns, path, stat,
> > @@ -602,9 +608,11 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer)
> >  
> >  	memset(&tmp, 0, sizeof(tmp));
> >  
> > -	tmp.stx_mask = stat->result_mask;
> > +	/* STATX_CHANGE_COOKIE is kernel-only for now */
> > +	tmp.stx_mask = stat->result_mask & ~STATX_CHANGE_COOKIE;
> >  	tmp.stx_blksize = stat->blksize;
> > -	tmp.stx_attributes = stat->attributes;
> > +	/* STATX_ATTR_CHANGE_MONOTONIC is kernel-only for now */
> > +	tmp.stx_attributes = stat->attributes & ~STATX_ATTR_CHANGE_MONOTONIC;
> >  	tmp.stx_nlink = stat->nlink;
> >  	tmp.stx_uid = from_kuid_munged(current_user_ns(), stat->uid);
> >  	tmp.stx_gid = from_kgid_munged(current_user_ns(), stat->gid);
> > @@ -643,6 +651,11 @@ int do_statx(int dfd, struct filename *filename, unsigned int flags,
> >  	if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE)
> >  		return -EINVAL;
> >  
> > +	/* STATX_CHANGE_COOKIE is kernel-only for now. Ignore requests
> > +	 * from userland.
> > +	 */
> > +	mask &= ~STATX_CHANGE_COOKIE;
> > +
> >  	error = vfs_statx(dfd, filename, flags, &stat, mask);
> >  	if (error)
> >  		return error;
> > diff --git a/include/linux/stat.h b/include/linux/stat.h
> > index ff277ced50e9..52150570d37a 100644
> > --- a/include/linux/stat.h
> > +++ b/include/linux/stat.h
> 
> Sorry being late to the party once again...
> 
> > @@ -52,6 +52,15 @@ struct kstat {
> >  	u64		mnt_id;
> >  	u32		dio_mem_align;
> >  	u32		dio_offset_align;
> > +	u64		change_cookie;
> >  };
> >  
> > +/* These definitions are internal to the kernel for now. Mainly used by nfsd. */
> > +
> > +/* mask values */
> > +#define STATX_CHANGE_COOKIE		0x40000000U	/* Want/got stx_change_attr */
> > +
> > +/* file attribute values */
> > +#define STATX_ATTR_CHANGE_MONOTONIC	0x8000000000000000ULL /* version monotonically increases */
> 
> maybe it would be better to copy what we do for SB_* vs SB_I_* flags and
> at least rename them to:
> 
> STATX_I_CHANGE_COOKIE
> STATX_I_ATTR_CHANGE_MONOTONIC
> i_change_cookie

An "i_"/"I_" prefix says "inode" to me. Maybe I've been at this too
long. ;)

> 
> to visually distinguish internal and external flags.
> 
> And also if possible it might be useful to move STATX_I_* flags to the
> higher 32 bits and then one can use upper_32_bits to retrieve kernel
> internal flags and lower_32_bits for userspace flags in tiny wrappers.
> 
> (I did something similar for clone3() a few years ago but there to
> distinguish between flags available both in clone() and clone3() and
> such that are only available in clone3().)
> 
> But just a thought. I mostly worry about accidently leaking this to
> userspace so ideally we'd even have separate fields in struct kstat for
> internal and external attributes but that might bump kstat size, though
> I don't think struct kstat is actually ever really allocated all that
> much.

I'm not sure that the internal/external distinction matters much for
filesystem providers or consumers of it. The place that it matters is at
the userland interface level, where statx or something similar is
called.

At some point we may want to make STATX_CHANGE_COOKIE queryable via
statx, at which point we'll have to have a big flag day where we do
s/STATX_I_CHANGE_COOKIE/STATX_CHANGE_COOKIE/.

I don't think it's worth it here.
-- 
Jeff Layton <jlayton@xxxxxxxxxx>




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux