On Mon, 2023-02-13 at 16:49 -0800, Rick Macklem wrote: > On Mon, Feb 13, 2023 at 1:14 PM Jeff Layton <jlayton@xxxxxxxxxx> > wrote: > > > > CAUTION: This email originated from outside of the University of > > Guelph. Do not click links or open attachments unless you recognize > > the sender and know the content is safe. If in doubt, forward > > suspicious emails to IThelp@xxxxxxxxxxx > > > > > > The write verifier exists to tell the client when the server may > > have > > forgotten some unstable writes. The typical way that this happens > > is if > > the server crashes, but we've also extended nfsd to change it when > > there > > are writeback errors as well. > > > > The way it works today though, we call something like vfs_fsync > > (e.g. > > for a COMMIT call) and if we get back an error, we'll reset the > > write > > verifier. > > > > This is non-optimal for a couple of reasons: > > > > 1/ There could be significant delay between an error being > > recorded and the reset. It would be ideal if the write verifier > > were to > > change as soon as the error was recorded. > > > > 2/ It's a bit of a waste, in that if we get a writeback error on a > > single inode, we'll end up resetting the write verifier for > > everything, > > even on inodes that may be fine (e.g. on a completely separate fs). > > > Here's the snippet from RFC8881: > The final portion of the result is the field writeverf. This > field > is the write verifier and is a cookie that the client can use to > determine whether a server has changed instance state (e.g., > server > restart) between a call to WRITE and a subsequent call to either > WRITE or COMMIT. This cookie MUST be unchanged during a single > instance of the NFSv4.1 server and MUST be unique between > instances > of the NFSv4.1 server. If the cookie changes, then the client > MUST > assume that any data written with an UNSTABLE4 value for committed > and an old writeverf in the reply has been lost and will need to > be > recovered. > > I've always interpreted the writeverf as "per-server" and not "per- > file". > Although I'll admit the above does not make that crystal clear, it > does make > it clear that the writeverf applies to a "server instance" and not a > file or > file system on the server. > > The FreeBSD client assumes it is "per-server" and re-writes all > uncommitted > writes for the server, not just ones for the file (or file system) > the > writeverf is > replied with. (I vaguely recall Solaris does the same?) > > At the very least, I think you should run this past the IETF working > group > (nfsv4@xxxxxxxx) to see what they say w.r.t. the writeverf being > "per-file" vs > "per-server". > As I recall, we've already had this discussion on the IETF NFSv4 working group mailing list: https://mailarchive.ietf.org/arch/msg/nfsv4/99Ow2muMylXKWd9lzi9_BX2LJDY/ That's why I kept it a global in the first place. Now note that RFC8881 does also clarify in Section 18.3.3 that: The server must vary the value of the write verifier at each server event or instantiation that may lead to a loss of uncommitted data. Most commonly this occurs when the server is restarted; however, other events at the server may result in uncommitted data loss as well. So I feel it is quite OK to use the verifier the way we do now in order to signify that a fatal write error has occurred and that clients must resend any data that was uncommitted. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx