On Mon, 01 Jul 2024, Jeff Layton wrote:
> On Sun, 2024-06-30 at 16:15 -0400, Chuck Lever wrote:
> > On Sun, Jun 30, 2024 at 03:59:58PM -0400, Jeff Layton wrote:
> > > On Sun, 2024-06-30 at 15:55 -0400, Chuck Lever wrote:
> > > > On Sun, Jun 30, 2024 at 03:52:51PM -0400, Jeff Layton wrote:
> > > > > On Sun, 2024-06-30 at 15:44 -0400, Mike Snitzer wrote:
> > > > > > On Sun, Jun 30, 2024 at 10:49:51AM -0400, Chuck Lever wrote:
> > > > > > > On Sat, Jun 29, 2024 at 06:18:42PM -0400, Chuck Lever wrote:
> > > > > > > > > +
> > > > > > > > > +	/* nfs_fh -> svc_fh */
> > > > > > > > > +	if (nfs_fh->size > NFS4_FHSIZE) {
> > > > > > > > > +		status = -EINVAL;
> > > > > > > > > +		goto out;
> > > > > > > > > +	}
> > > > > > > > > +	fh_init(&fh, NFS4_FHSIZE);
> > > > > > > > > +	fh.fh_handle.fh_size = nfs_fh->size;
> > > > > > > > > +	memcpy(fh.fh_handle.fh_raw, nfs_fh->data, nfs_fh->size);
> > > > > > > > > +
> > > > > > > > > +	if (fmode & FMODE_READ)
> > > > > > > > > +		mayflags |= NFSD_MAY_READ;
> > > > > > > > > +	if (fmode & FMODE_WRITE)
> > > > > > > > > +		mayflags |= NFSD_MAY_WRITE;
> > > > > > > > > +
> > > > > > > > > +	beres = nfsd_file_acquire(rqstp, &fh, mayflags, &nf);
> > > > > > > > > +	if (beres) {
> > > > > > > > > +		status = nfs_stat_to_errno(be32_to_cpu(beres));
> > > > > > > > > +		goto out_fh_put;
> > > > > > > > > +	}
> > > > > > > > > 
> > > > > > > > So I'm wondering whether just calling fh_verify() and then
> > > > > > > > nfsd_open_verified() would be simpler and/or good enough here. Is
> > > > > > > > there a strong reason to use the file cache for locally opened
> > > > > > > > files? Jeff, any thoughts?
> > > > > > > > 
> > > > > > > > Will there be writeback ramifications for
> > > > > > > > doing this? Maybe we need a comment here explaining how these files
> > > > > > > > are garbage collected (just an fput by the local I/O client, I would
> > > > > > > > guess).
> > > > > > > > 
> > > > > > > OK, going back to this: Since right here we immediately call
> > > > > > > 
> > > > > > > 	nfsd_file_put(nf);
> > > > > > > 
> > > > > > > There are no writeback ramifications nor any need to comment about
> > > > > > > garbage collection. But this still seems like a lot of (possibly
> > > > > > > unnecessary) overhead for simply obtaining a struct file.
> > > > > > 
> > > > > > Easy enough change, probably best to avoid the filecache but would like
> > > > > > to verify with Jeff before switching:
> > > > > > 
> > > > > > diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c
> > > > > > index 1d6508aa931e..85ebf63789fb 100644
> > > > > > --- a/fs/nfsd/localio.c
> > > > > > +++ b/fs/nfsd/localio.c
> > > > > > @@ -197,7 +197,6 @@ int nfsd_open_local_fh(struct net *cl_nfssvc_net,
> > > > > >  	const struct cred *save_cred;
> > > > > >  	struct svc_rqst *rqstp;
> > > > > >  	struct svc_fh fh;
> > > > > > -	struct nfsd_file *nf;
> > > > > >  	__be32 beres;
> > > > > >  
> > > > > >  	if (nfs_fh->size > NFS4_FHSIZE)
> > > > > > @@ -235,13 +234,12 @@ int nfsd_open_local_fh(struct net *cl_nfssvc_net,
> > > > > >  	if (fmode & FMODE_WRITE)
> > > > > >  		mayflags |= NFSD_MAY_WRITE;
> > > > > >  
> > > > > > -	beres = nfsd_file_acquire(rqstp, &fh, mayflags, &nf);
> > > > > > +	beres = fh_verify(rqstp, &fh, S_IFREG, mayflags);
> > > > > >  	if (beres) {
> > > > > >  		status = nfs_stat_to_errno(be32_to_cpu(beres));
> > > > > >  		goto out_fh_put;
> > > > > >  	}
> > > > > > -	*pfilp = get_file(nf->nf_file);
> > > > > > -	nfsd_file_put(nf);
> > > > > > +	status = nfsd_open_verified(rqstp, &fh, mayflags, pfilp);
> > > > > >  out_fh_put:
> > > > > >  	fh_put(&fh);
> > > > > >  	nfsd_local_fakerqst_destroy(rqstp);
> > > > > > 
> > > > > My suggestion would be to _not_ do this. I think you do want to use the
> > > > > filecache (mostly for performance reasons).
> > > > > 
> > > > But look carefully:
> > > > 
> > > > -- we're not calling nfsd_file_acquire_gc() here
> > > > 
> > > > -- we're immediately calling nfsd_file_put() on the returned nf
> > > > 
> > > > There's nothing left in the file cache when nfsd_open_local_fh()
> > > > returns. Each call here will do a full file open and a full close.
> > > > 
> > > Good point. This should be calling nfsd_file_acquire_gc(), IMO.
> > 
> > So that goes to my point yesterday about writeback ramifications.
> > 
> > If these open files linger in the file cache, then when will they
> > get written back to storage and by whom? Is it going to be an nfsd
> > thread writing them back as part of garbage collection?
> > 
> 
> Usually the client is issuing regular COMMITs. If that doesn't happen,
> then the flusher threads should get the rest.
> 
> Side note: I don't guess COMMIT goes over the localio path yet, does
> it? Maybe it should. It would be nice to not tie up an nfsd thread with
> writeback.

The documentation certainly claims that COMMIT uses the localio path.
I haven't double checked the code but I'd be very surprised if it didn't.

NeilBrown

> 
> > So, you're saying that the local I/O client will always behave like
> > NFSv3 in this regard, and open/read/close, open/write/close instead
> > of hanging on to the open file? That seems... suboptimal... and not
> > expected for a local file. That needs to be documented in the
> > LOCALIO design doc.
> > 
> 
> I imagine so, which is why I suggest using the filecache. If we get one
> READ or WRITE for the file via localio, we're pretty likely to get
> more. Why not amortize that file open over several operations?
> 
> > I'm also concerned about local applications closing a file but
> > having an open file handle linger in the file cache -- that can
> > prevent other accesses to the file until the GC ejects that open
> > file, as we've seen in the field.
> > 
> > IMHO nfsd_file_acquire_gc() is going to have some unwanted side
> > effects.
> > 
> 
> Typically, the client issues COMMIT calls when the client-side fd is
> closed (for CTO). While I think we do need to be able to deal with
> flushing files with dirty data that are left "hanging", that shouldn't
> be the common case. Most of the time, the client is going to be issuing
> regular COMMITs so that it can clean its pages.
> 
> IOW, I don't see how localio is any different than the case of normal
> v3 IO in this respect.
> -- 
> Jeff Layton <jlayton@xxxxxxxxxx>
> 
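
For reference, a minimal sketch of the hunk being debated above if it keeps
the filecache but switches to the garbage-collected entry point Jeff
suggests. This assumes nfsd_file_acquire_gc() takes the same arguments as
the nfsd_file_acquire() call quoted earlier; the variables (rqstp, fh,
mayflags, nf, pfilp, status) are the ones from the quoted
nfsd_open_local_fh() code, so treat it as an illustration rather than a
tested patch:

	/*
	 * Sketch: acquire through the garbage-collected filecache so the
	 * nfsd_file (and its open struct file) survives this call and can
	 * be reused by later localio READ/WRITE requests for the same
	 * file handle. Assumes nfsd_file_acquire_gc() mirrors
	 * nfsd_file_acquire()'s signature.
	 */
	beres = nfsd_file_acquire_gc(rqstp, &fh, mayflags, &nf);
	if (beres) {
		status = nfs_stat_to_errno(be32_to_cpu(beres));
		goto out_fh_put;
	}
	/* Hand the localio client its own reference on the struct file. */
	*pfilp = get_file(nf->nf_file);
	/*
	 * Drop our reference. With the _gc variant the entry lingers on the
	 * filecache LRU instead of being torn down immediately: that is the
	 * amortization Jeff wants, and also the lingering open file Chuck
	 * is worried about.
	 */
	nfsd_file_put(nf);
out_fh_put:
	fh_put(&fh);

The trade-off the thread is circling is exactly this: fh_verify() plus
nfsd_open_verified() opens and closes the file on every call, while the
_gc variant keeps the open file cached across calls at the cost of the
file staying open after the local application has closed it, until the
garbage collector ejects it.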