On Fri, Jun 21, 2024 at 04:08:20PM +1000, NeilBrown wrote: > On Thu, 20 Jun 2024, Mike Snitzer wrote: > > From: Weston Andros Adamson <dros@xxxxxxxxxxxxxxx> > > > > Add client support for bypassing NFS for localhost reads, writes, and > > commits. This is only useful when the client and the server are > > running on the same host. > > > > nfs_local_probe() is stubbed out, later commits will enable client and > > server handshake via a Linux-only LOCALIO auxiliary RPC protocol. > > > > This has dynamic binding with the nfsd module (via nfs_localio module > > which is part of nfs_common). Localio will only work if nfsd is > > already loaded. > > > > The "localio_enabled" nfs kernel module parameter can be used to > > disable and enable the ability to use localio support. > > > > Tracepoints were added for nfs_local_open_fh, nfs_local_enable and > > nfs_local_disable. > > > > Also, pass the stored cl_nfssvc_net from the client to the server as > > first argument to nfsd_open_local_fh() to ensure the proper network > > namespace is used for localio. > > > > Signed-off-by: Weston Andros Adamson <dros@xxxxxxxxxxxxxxx> > > Signed-off-by: Peng Tao <tao.peng@xxxxxxxxxxxxxxx> > > Signed-off-by: Lance Shelton <lance.shelton@xxxxxxxxxxxxxxx> > > Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> > > Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx> > > --- > > fs/nfs/Makefile | 1 + > > fs/nfs/client.c | 3 + > > fs/nfs/inode.c | 4 + > > fs/nfs/internal.h | 51 +++ > > fs/nfs/localio.c | 722 ++++++++++++++++++++++++++++++++++++++ > > fs/nfs/nfstrace.h | 61 ++++ > > fs/nfs/pagelist.c | 3 + > > fs/nfs/write.c | 3 + > > fs/nfsd/Makefile | 1 + > > fs/nfsd/filecache.c | 2 +- > > fs/nfsd/localio.c | 244 +++++++++++++ > > fs/nfsd/nfssvc.c | 1 + > > fs/nfsd/trace.h | 3 +- > > fs/nfsd/vfs.h | 9 + > > include/linux/nfs.h | 2 + > > include/linux/nfs_fs.h | 2 + > > include/linux/nfs_fs_sb.h | 1 + > > include/linux/nfs_xdr.h | 1 + > > 18 files changed, 1112 insertions(+), 2 deletions(-) > > create mode 100644 fs/nfs/localio.c > > create mode 100644 fs/nfsd/localio.c > > <snip> > > diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c > > new file mode 100644 > > index 000000000000..38d0832442b2 > > --- /dev/null > > +++ b/fs/nfs/localio.c <snip> > > +static void > > +nfs_local_iter_init(struct iov_iter *i, struct nfs_local_kiocb *iocb, int dir) > > +{ > > + struct nfs_pgio_header *hdr = iocb->hdr; > > + > > + if (hdr->args.pgbase != 0) { > > + iov_iter_bvec(i, dir, iocb->bvec, > > + hdr->page_array.npages, > > + hdr->args.count + hdr->args.pgbase); > > + iov_iter_advance(i, hdr->args.pgbase); > > + } else > > + iov_iter_bvec(i, dir, iocb->bvec, > > + hdr->page_array.npages, hdr->args.count); > > Both branches of this if() do exactly the same thing. iov_iter_advance > is a no-op if the size arg is zero. iov_iter_advance doesn't look to be a no-op if the size arg is zero. > Is it really worth increasing the code size to sometimes avoid a > function call? > > At least we should for the iov_iter_bvec() inconditionally, then maybe > call _advance(). For v7, I've fixed it so we do what you suggest. > > +/* > > + * Complete the I/O from iocb->kiocb.ki_complete() > > + * > > + * Note that this function can be called from a bottom half context, > > + * hence we need to queue the fput() etc to a workqueue > > fput() is not a good excuse for a workqueue - the work is always > deferred either to a workqueue or to a process return-from-syscall > context. > However the ->rpc_call_done() and vfs_fsync_range() calls are excellent > justification for a workqueue. > So I think the comment should be improved, but the code looks sensible. OK. > > +static void > > +nfs_get_vfs_attr(struct file *filp, struct nfs_fattr *fattr) > > +{ > > + struct kstat stat; > > + > > + if (fattr != NULL && vfs_getattr(&filp->f_path, &stat, > > + STATX_INO | > > + STATX_ATIME | > > + STATX_MTIME | > > + STATX_CTIME | > > + STATX_SIZE | > > + STATX_BLOCKS, > > + AT_STATX_SYNC_AS_STAT) == 0) { > > + fattr->valid = NFS_ATTR_FATTR_FILEID | > > + NFS_ATTR_FATTR_CHANGE | > > + NFS_ATTR_FATTR_SIZE | > > + NFS_ATTR_FATTR_ATIME | > > + NFS_ATTR_FATTR_MTIME | > > + NFS_ATTR_FATTR_CTIME | > > + NFS_ATTR_FATTR_SPACE_USED; > > + fattr->fileid = stat.ino; > > + fattr->size = stat.size; > > + fattr->atime = stat.atime; > > + fattr->mtime = stat.mtime; > > + fattr->ctime = stat.ctime; > > + fattr->change_attr = nfs_timespec_to_change_attr(&fattr->ctime); > > This looks wrong for NFSv4. I think we should use > nfsd4_change_attribute(). > Maybe it isn't important, but if it isn't I'd like to see an explanation > why. > > > + fattr->du.nfs3.used = stat.blocks << 9; > > + } > > +} Not following, this is client code so it doesn't have access to nfsd4_change_attribute(). Pending clarification, and further review on my part, leaving this item to one side (so won't be addressed in v7). Thanks, Mike