Re: [PATCH v6 06/18] nfs/nfsd: add "localio" support

Mike Snitzer <snitzer@xxxxxxxxxx> · Tue, 25 Jun 2024 00:59:30 -0400

On Mon, Jun 24, 2024 at 08:27:39AM +1000, NeilBrown wrote:
> On Sat, 22 Jun 2024, Mike Snitzer wrote:
> > On Fri, Jun 21, 2024 at 04:08:20PM +1000, NeilBrown wrote:
> > > 
> > > Both branches of this if() do exactly the same thing.  iov_iter_advance
> > > is a no-op if the size arg is zero.
> > 
> > iov_iter_advance doesn't look to be a no-op if the size arg is zero.
> 
> void iov_iter_advance(struct iov_iter *i, size_t size)
> {
> 	if (unlikely(i->count < size))
> 		size = i->count;
> 	if (likely(iter_is_ubuf(i)) || unlikely(iov_iter_is_xarray(i))) {
> 		i->iov_offset += size;
> 		i->count -= size;
> 	} else if (likely(iter_is_iovec(i) || iov_iter_is_kvec(i))) {
> 		/* iovec and kvec have identical layouts */
> 		iov_iter_iovec_advance(i, size);
> 	} else if (iov_iter_is_bvec(i)) {
> 		iov_iter_bvec_advance(i, size);
> 	} else if (iov_iter_is_discard(i)) {
> 		i->count -= size;
> 	}
> }
> 
> This adds "size" to offset, and subtracts "size" from count.  For iovec
> and bvec it is a slightly complicated dance to achieve this, but that is
> the net result.
> So if "size" is zero there is no change to the iov_iter.  Just some
> wasted cycles.  Do those cycles justify the extra conditional branch?  I
> don't know.  I would generally prefer simpler code which is only
> optimised with evidence.  Admittedly I don't always follow that
> preference myself and I won't hold you to it.  But I thought the review
> would be incomplete without mentioning it.

OK, thanks.

> > > > +static void
> > > > +nfs_get_vfs_attr(struct file *filp, struct nfs_fattr *fattr)
> > > > +{
> > > > +	struct kstat stat;
> > > > +
> > > > +	if (fattr != NULL && vfs_getattr(&filp->f_path, &stat,
> > > > +					 STATX_INO |
> > > > +					 STATX_ATIME |
> > > > +					 STATX_MTIME |
> > > > +					 STATX_CTIME |
> > > > +					 STATX_SIZE |
> > > > +					 STATX_BLOCKS,
> > > > +					 AT_STATX_SYNC_AS_STAT) == 0) {
> > > > +		fattr->valid = NFS_ATTR_FATTR_FILEID |
> > > > +			NFS_ATTR_FATTR_CHANGE |
> > > > +			NFS_ATTR_FATTR_SIZE |
> > > > +			NFS_ATTR_FATTR_ATIME |
> > > > +			NFS_ATTR_FATTR_MTIME |
> > > > +			NFS_ATTR_FATTR_CTIME |
> > > > +			NFS_ATTR_FATTR_SPACE_USED;
> > > > +		fattr->fileid = stat.ino;
> > > > +		fattr->size = stat.size;
> > > > +		fattr->atime = stat.atime;
> > > > +		fattr->mtime = stat.mtime;
> > > > +		fattr->ctime = stat.ctime;
> > > > +		fattr->change_attr = nfs_timespec_to_change_attr(&fattr->ctime);
> > > 
> > > This looks wrong for NFSv4.  I think we should use
> > > nfsd4_change_attribute().
> > > Maybe it isn't important, but if it isn't I'd like to see an explanation
> > > why.
> > > 
> > > > +		fattr->du.nfs3.used = stat.blocks << 9;
> > > > +	}
> > > > +}
> > 
> > Not following, this is client code so it doesn't have access to
> > nfsd4_change_attribute().
> 
> This is nfs-localio code which blurs the boundary between server and
> client...
> 
> The change_attr is used by NFS to detect if a file might have changed.
> This code is used to get the attributes after a write request.
> NFS uses a GETATTR request to the server at other times.  The
> change_attr should be consistent between the two else comparisons will
> be meaningless.
> 
> So I think that nfs_get_vfs_attr() should use the same change_attr as
> the one that would be used if the NFS GETATTR request were made.
> For NFSv3, that is nfs_timespec_to_change_attr() as you have.
> For NFSv4 it is something different.
> 
> I think that having inconsistent change_attrs could cause NFS to flush
> its page cache unnecessarily.  As it can read directly from the
> server-side where is likely is cached, that might not be a problem.  If
> that reasoning does apply it should be explained.
> 
> However there is talk of exporting the "i_version" number to userspace
> through xattr.  For NFS that is essentially the change_attr. If we did
> that we would really want to keep the number consistent.
> 
> We could easily move nfsd4_change_attribute() into nfs_common or even
> make it an inline in some common include file.  It doesn't use any nfsd
> internals.

OK, makes sense, thanks for clarifying.  I'll fix it for v8.

Mike