> On Sep 11, 2023, at 4:54 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote: > > On Mon, 2023-09-11 at 16:14 -0400, Chuck Lever wrote: >> On Mon, Sep 11, 2023 at 02:43:57PM -0400, trondmy@xxxxxxxxx wrote: >>> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> >>> >>> If fsync() is returning EAGAIN, then we can assume that the >>> filesystem >>> being exported is something like NFS with the 'softerr' mount >>> option >>> enabled, and that it is just asking us to replay the fsync() >>> operation >>> at a later date. >>> If we see an ESTALE, then ditto: the file is gone, so there is no >>> danger >>> of losing the error. >>> For those cases, do not reset the write verifier. >> >> Out of interest, what's the hazard in a write verifier change in >> these cases? There could be a slight performance penalty, I imagine, >> but how frequently does this happen? > > When re-exporting to NFSv4 clients, it should be less of a problem, > since any REMOVE will result in a sillyrenamed file that only > disappears once the file is closed. However with NFSv3 clients, that is > circumvented by the fact that the filecache closes the files when they > are inactive. We've seen this occur frequently with VMware vmdks: their > lock files appear to generate a lot of these phantom ESTALE writes. > > As for EAGAIN, I just pushed out a 2 patch client series that makes it > a lot more frequent when re-exporting NFSv4 with 'softerr'. > > Finally, it is worth noting that a write verifier change has a global > effect, causing retransmission by all clients of all uncommitted > unstable writes for all files, so is worth mitigating where possible. Good info. I've added some of this to the patch description. >> One more below. >> >> >>> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> >>> --- >>> fs/nfsd/vfs.c | 29 +++++++++++++++++++---------- >>> 1 file changed, 19 insertions(+), 10 deletions(-) >>> >>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c >>> index 98fa4fd0556d..31daf9f63572 100644 >>> --- a/fs/nfsd/vfs.c >>> +++ b/fs/nfsd/vfs.c >>> @@ -337,6 +337,20 @@ nfsd_lookup(struct svc_rqst *rqstp, struct >>> svc_fh *fhp, const char *name, >>> return err; >>> } >>> >>> +static void >>> +commit_reset_write_verifier(struct nfsd_net *nn, struct svc_rqst >>> *rqstp, >>> + int err) >>> +{ >>> + switch (err) { >>> + case -EAGAIN: >>> + case -ESTALE: >>> + break; >>> + default: >>> + nfsd_reset_write_verifier(nn); >>> + trace_nfsd_writeverf_reset(nn, rqstp, err); >>> + } >>> +} >>> + >>> /* >>> * Commit metadata changes to stable storage. >>> */ >>> @@ -647,8 +661,7 @@ __be32 nfsd4_clone_file_range(struct svc_rqst >>> *rqstp, >>> &nfsd4_get_cstate(rqstp)- >>>> current_fh, >>> dst_pos, >>> count, status); >>> - nfsd_reset_write_verifier(nn); >>> - trace_nfsd_writeverf_reset(nn, rqstp, >>> status); >>> + commit_reset_write_verifier(nn, rqstp, >>> status); >>> ret = nfserrno(status); >>> } >>> } >>> @@ -1170,8 +1183,7 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct >>> svc_fh *fhp, struct nfsd_file *nf, >>> host_err = vfs_iter_write(file, &iter, &pos, flags); >>> file_end_write(file); >>> if (host_err < 0) { >>> - nfsd_reset_write_verifier(nn); >>> - trace_nfsd_writeverf_reset(nn, rqstp, host_err); >>> + commit_reset_write_verifier(nn, rqstp, host_err); >> >> Can generic_file_write_iter() or its brethren return STALE or AGAIN >> before they get to the generic_write_sync() call ? > > The call to nfs_revalidate_file_size(), which can occur when you are > appending to the file (whether or not O_APPEND is set) could indeed > return ESTALE. > With the new patchset mentioned above, it could also return EAGAIN. Sounds like I should drop this hunk when applying this fix. >>> goto out_nfserr; >>> } >>> *cnt = host_err; >>> @@ -1183,10 +1195,8 @@ nfsd_vfs_write(struct svc_rqst *rqstp, >>> struct svc_fh *fhp, struct nfsd_file *nf, >>> >>> if (stable && use_wgather) { >>> host_err = wait_for_concurrent_writes(file); >>> - if (host_err < 0) { >>> - nfsd_reset_write_verifier(nn); >>> - trace_nfsd_writeverf_reset(nn, rqstp, >>> host_err); >>> - } >>> + if (host_err < 0) >>> + commit_reset_write_verifier(nn, rqstp, >>> host_err); >>> } >>> >>> out_nfserr: >>> @@ -1329,8 +1339,7 @@ nfsd_commit(struct svc_rqst *rqstp, struct >>> svc_fh *fhp, struct nfsd_file *nf, >>> err = nfserr_notsupp; >>> break; >>> default: >>> - nfsd_reset_write_verifier(nn); >>> - trace_nfsd_writeverf_reset(nn, rqstp, >>> err2); >>> + commit_reset_write_verifier(nn, rqstp, >>> err2); >>> err = nfserrno(err2); >>> } >>> } else >>> -- >>> 2.41.0 >>> >> > > -- > Trond Myklebust > Linux NFS client maintainer, Hammerspace > trond.myklebust@xxxxxxxxxxxxxxx -- Chuck Lever