> On Jun 21, 2021, at 2:00 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote: > > On Mon, 2021-06-21 at 16:49 +0000, Chuck Lever III wrote: >> Hi- >> >>> On Jun 17, 2021, at 7:26 PM, trondmy@xxxxxxxxxx wrote: >>> >>> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> >>> >>> When flushing out the unstable file writes as part of a COMMIT >>> call, try >>> to perform most of of the data writes and waits outside the >>> semaphore. >>> >>> This means that if the client is sending the COMMIT as part of a >>> memory >>> reclaim operation, then it can continue performing I/O, with >>> contention >>> for the lock occurring only once the data sync is finished. >>> >>> Fixes: 5011af4c698a ("nfsd: Fix stable writes") >>> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> >> >> The good news is I've found no functional regressions. The bad >> news is I haven't seen any difference in performance. Is there >> a particular test that I can run to observe improvement? > > I'd expect that re-exported NFS would be the best test since fsync() is > a high latency operation when the page cache is loaded. You also want > to use a client with relatively limited memory so that it will try to > reclaim memory by pushing out dirty pages and doing COMMIT. I thought I was hitting those low-memory cases with direct I/O testing. <shrug> >> I wonder about adding a Fixes: tag for a change that the patch >> description describes as an optimization. > > I've occasionally hit OOM situations in the re-export case when the r/w > lock contention causes softerr failure to be serialised. > i.e. if the server is down, and you're essentially hoping that the nfsd > threads will give up and return EJUKEBOX/NFS4ERR_DELAY to the client, > then that lock ensures that threads fail one-by-one (grab lock, write, > timeout, release lock) instead of being able to all fail at once > (write, timeout). > >> >> >>> --- >>> fs/nfsd/vfs.c | 18 ++++++++++++++++-- >>> 1 file changed, 16 insertions(+), 2 deletions(-) >>> >>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c >>> index 15adf1f6ab21..46485c04740d 100644 >>> --- a/fs/nfsd/vfs.c >>> +++ b/fs/nfsd/vfs.c >>> @@ -1123,6 +1123,19 @@ nfsd_write(struct svc_rqst *rqstp, struct >>> svc_fh *fhp, loff_t offset, >>> } >>> >>> #ifdef CONFIG_NFSD_V3 >>> +static int >>> +nfsd_filemap_write_and_wait_range(struct nfsd_file *nf, loff_t >>> offset, >>> + loff_t end) >>> +{ >>> + struct address_space *mapping = nf->nf_file->f_mapping; >>> + int ret = filemap_fdatawrite_range(mapping, offset, end); >>> + >>> + if (ret) >>> + return ret; >>> + filemap_fdatawait_range_keep_errors(mapping, offset, end); >>> + return 0; >>> +} >>> + >>> /* >>> * Commit all pending writes to stable storage. >>> * >>> @@ -1153,10 +1166,11 @@ nfsd_commit(struct svc_rqst *rqstp, struct >>> svc_fh *fhp, >>> if (err) >>> goto out; >>> if (EX_ISSYNC(fhp->fh_export)) { >>> - int err2; >>> + int err2 = nfsd_filemap_write_and_wait_range(nf, >>> offset, end); >>> >>> down_write(&nf->nf_rwsem); >>> - err2 = vfs_fsync_range(nf->nf_file, offset, end, >>> 0); >>> + if (!err2) >>> + err2 = vfs_fsync_range(nf->nf_file, offset, >>> end, 0); >>> switch (err2) { >>> case 0: >>> nfsd_copy_boot_verifier(verf, >>> net_generic(nf->nf_net, >>> -- >>> 2.31.1 >>> >> >> -- >> Chuck Lever >> >> >> > > -- > Trond Myklebust > Linux NFS client maintainer, Hammerspace > trond.myklebust@xxxxxxxxxxxxxxx -- Chuck Lever