On Mon, 2023-09-11 at 16:14 -0400, Chuck Lever wrote: > On Mon, Sep 11, 2023 at 02:43:57PM -0400, trondmy@xxxxxxxxx wrote: > > From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> > > > > If fsync() is returning EAGAIN, then we can assume that the > > filesystem > > being exported is something like NFS with the 'softerr' mount > > option > > enabled, and that it is just asking us to replay the fsync() > > operation > > at a later date. > > If we see an ESTALE, then ditto: the file is gone, so there is no > > danger > > of losing the error. > > For those cases, do not reset the write verifier. > > Out of interest, what's the hazard in a write verifier change in > these cases? There could be a slight performance penalty, I imagine, > but how frequently does this happen? When re-exporting to NFSv4 clients, it should be less of a problem, since any REMOVE will result in a sillyrenamed file that only disappears once the file is closed. However with NFSv3 clients, that is circumvented by the fact that the filecache closes the files when they are inactive. We've seen this occur frequently with VMware vmdks: their lock files appear to generate a lot of these phantom ESTALE writes. As for EAGAIN, I just pushed out a 2 patch client series that makes it a lot more frequent when re-exporting NFSv4 with 'softerr'. Finally, it is worth noting that a write verifier change has a global effect, causing retransmission by all clients of all uncommitted unstable writes for all files, so is worth mitigating where possible. > > One more below. > > > > Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> > > --- > > fs/nfsd/vfs.c | 29 +++++++++++++++++++---------- > > 1 file changed, 19 insertions(+), 10 deletions(-) > > > > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c > > index 98fa4fd0556d..31daf9f63572 100644 > > --- a/fs/nfsd/vfs.c > > +++ b/fs/nfsd/vfs.c > > @@ -337,6 +337,20 @@ nfsd_lookup(struct svc_rqst *rqstp, struct > > svc_fh *fhp, const char *name, > > return err; > > } > > > > +static void > > +commit_reset_write_verifier(struct nfsd_net *nn, struct svc_rqst > > *rqstp, > > + int err) > > +{ > > + switch (err) { > > + case -EAGAIN: > > + case -ESTALE: > > + break; > > + default: > > + nfsd_reset_write_verifier(nn); > > + trace_nfsd_writeverf_reset(nn, rqstp, err); > > + } > > +} > > + > > /* > > * Commit metadata changes to stable storage. > > */ > > @@ -647,8 +661,7 @@ __be32 nfsd4_clone_file_range(struct svc_rqst > > *rqstp, > > &nfsd4_get_cstate(rqstp)- > > >current_fh, > > dst_pos, > > count, status); > > - nfsd_reset_write_verifier(nn); > > - trace_nfsd_writeverf_reset(nn, rqstp, > > status); > > + commit_reset_write_verifier(nn, rqstp, > > status); > > ret = nfserrno(status); > > } > > } > > @@ -1170,8 +1183,7 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct > > svc_fh *fhp, struct nfsd_file *nf, > > host_err = vfs_iter_write(file, &iter, &pos, flags); > > file_end_write(file); > > if (host_err < 0) { > > - nfsd_reset_write_verifier(nn); > > - trace_nfsd_writeverf_reset(nn, rqstp, host_err); > > + commit_reset_write_verifier(nn, rqstp, host_err); > > Can generic_file_write_iter() or its brethren return STALE or AGAIN > before they get to the generic_write_sync() call ? The call to nfs_revalidate_file_size(), which can occur when you are appending to the file (whether or not O_APPEND is set) could indeed return ESTALE. With the new patchset mentioned above, it could also return EAGAIN. > > > > goto out_nfserr; > > } > > *cnt = host_err; > > @@ -1183,10 +1195,8 @@ nfsd_vfs_write(struct svc_rqst *rqstp, > > struct svc_fh *fhp, struct nfsd_file *nf, > > > > if (stable && use_wgather) { > > host_err = wait_for_concurrent_writes(file); > > - if (host_err < 0) { > > - nfsd_reset_write_verifier(nn); > > - trace_nfsd_writeverf_reset(nn, rqstp, > > host_err); > > - } > > + if (host_err < 0) > > + commit_reset_write_verifier(nn, rqstp, > > host_err); > > } > > > > out_nfserr: > > @@ -1329,8 +1339,7 @@ nfsd_commit(struct svc_rqst *rqstp, struct > > svc_fh *fhp, struct nfsd_file *nf, > > err = nfserr_notsupp; > > break; > > default: > > - nfsd_reset_write_verifier(nn); > > - trace_nfsd_writeverf_reset(nn, rqstp, > > err2); > > + commit_reset_write_verifier(nn, rqstp, > > err2); > > err = nfserrno(err2); > > } > > } else > > -- > > 2.41.0 > > > -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx