Hi Neil and Bruce,

Thank you for your comments.
I now understand that this behavior is by design.

> > With NFSv4 there are no atomicity guarantees relating to writes and
> > changeid.
> > There is provision for atomicity around directory operations, but not
> > around data operations.

So I feel like clients cannot always trust changeid in NFSv4.
Should it be described in the spec?
For example, the following section refers to the usage of changeid:
https://datatracker.ietf.org/doc/html/draft-dnoveck-nfsv4-rfc5661bis#section-14.3.1

It says clients use the change attribute "to ensure that the data for the
OPENed file is still correctly reflected in the client's cache."
But in fact, it could be wrong.

Yuki Inoguchi

> -----Original Message-----
> From: 'bfields@xxxxxxxxxxxx' <bfields@xxxxxxxxxxxx>
> Sent: Tuesday, January 4, 2022 1:21 AM
> To: NeilBrown <neilb@xxxxxxx>
> Cc: Inoguchi, Yuki/井ノ口 雄生 <inoguchi.yuki@xxxxxxxxxxx>; 'Matt Benjamin'
> <mbenjami@xxxxxxxxxx>; 'Trond Myklebust' <trondmy@xxxxxxxxxxxxxxx>;
> 'linux-nfs@xxxxxxxxxxxxxxx' <linux-nfs@xxxxxxxxxxxxxxx>
> Subject: Re: client caching and locks
>
> On Tue, Dec 28, 2021 at 04:11:51PM +1100, NeilBrown wrote:
> > This is due to an (arguable) weakness in the NFSv4 protocol.
> > In NFSv3 the reply to the WRITE request had "wcc" data which would
> > report change information before and after the write and, if present, it
> > was required to be atomic.  So, providing timestamps had a high
> > resolution, the client0 would see change information from *before* the
> > write from client1 completed.  So it would know it needed to refresh
> > after that write became visible.
> >
> > With NFSv4 there are no atomicity guarantees relating to writes and
> > changeid.
> > There is provision for atomicity around directory operations, but not
> > around data operations.
> >
> > This means that if different clients access a file concurrently, then
> > their cache can become incorrect.  The only way to ensure uncorrupted
> > data is to use locking for ALL reads and writes.  The above 'od -i' does
> > not perform a locked read, so can give incorrect data.
> > If you got a whole-file lock before reading, then you should get correct
> > data.
>
> You'd also have to get a whole-file write lock on every write, wouldn't
> you, to prevent your own write from obscuring the change-attribute
> update caused by a concurrent writer?
>
> > You could argue that this requirement (always lock if there is any risk)
> > is by design, and so this aspect of the protocol is not a weakness.
>
> The spec discussion of byte-range locking and caching is here:
> https://datatracker.ietf.org/doc/html/rfc7530#section-10.3.2
>
> The nfs man page, under ac/noac, says "Using the noac option provides
> greater cache coherence among NFS clients accessing the same files,
> but it extracts a significant performance penalty. As such, judicious
> use of file locking is encouraged instead. The DATA AND METADATA
> COHERENCE section contains a detailed discussion of these trade-offs."
>
> That section does have a "Using file locks with NFS" subsection, but
> that subsection doesn't actually discuss the interaction of file locks
> with client caching.
>
> It's a confusing and under-documented area.
>
> --b.
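
For reference, here is a minimal sketch in Python of what "use locking for
ALL reads and writes" might look like from an application. It assumes a Linux
client where fcntl.lockf() takes POSIX byte-range locks that the NFS client
forwards to the server. It also assumes the client revalidates its cache when
the lock is acquired and flushes modified data before the lock is released,
which is the behavior RFC 7530 section 10.3.2 describes. The helper names
locked_read and locked_write are hypothetical, not from any library.

```python
import fcntl
import os


def locked_read(path):
    """Read the whole file while holding a shared whole-file lock."""
    with open(path, "rb") as f:
        # Default len=0, start=0 covers the file from offset 0 to EOF.
        fcntl.lockf(f, fcntl.LOCK_SH)
        try:
            return f.read()
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)


def locked_write(path, offset, data):
    """Write data at offset while holding an exclusive whole-file lock."""
    with open(path, "r+b") as f:
        fcntl.lockf(f, fcntl.LOCK_EX)
        try:
            f.seek(offset)
            f.write(data)
            # Push the data out of Python's buffer and ask the kernel to
            # commit it before the lock is dropped.
            f.flush()
            os.fsync(f.fileno())
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)
```

Every reader and every writer on every client would have to go through
helpers like these. A single unlocked read (like the 'od -i' in the thread)
or a single unlocked write is enough to reintroduce the stale-cache case.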