On Mon, Sep 17, 2018 at 01:57:17PM -0700, Stan Hu wrote:
> On both kernels in Ubuntu 16.04 (4.4.0-130) and CentOS 7.3
> (3.10.0-862.11.6.el7.x86_64) with NFS 4.1, I'm seeing an issue where
> stale data is shown if a file remains open on one machine, and the
> file is overwritten via a rename() on another. Here's my test:
>
> 1. On node A, create two different files on a shared NFS mount:
> "test1.txt" and "test2.txt".
> 2. On node B, continuously show the contents of the first file: "while
> true; do cat test1.txt; done"
> 3. On node B, run a process that keeps "test1.txt" open. For example,
> with Python, run:
> f = open('/nfs-mount/test.txt', 'r')
> 4. Rename test2.txt via "mv -f test2.txt test1.txt"
>
> On node B, I see the contents of the original test1.txt indefinitely,
> even after I disabled attribute caching and the lookup cache. I can
> make the while loop in step 2 show the new content if I perform one of
> these actions:
>
> 1. Run "ls /nfs-mount"
> 2. Close the open file in step 3
>
> I suspect the first causes the readdir cache revalidation to happen.
>
> Is this intended behavior, or is there a better way to achieve
> consistency here without performing one of these actions?

Sounds like a bug to me, but I'm not sure where. What filesystem are you
exporting? How much time do you think passes between steps 1 and 4? (I
*think* it's possible you could hit a bug caused by low ctime granularity
if you could get from step 1 to step 4 in less than a millisecond.)

Those kernel versions--are those the client (node A and B) versions, or
the server versions?

> Note that with an Isilon NFS server, instead of seeing stale content,
> I see "Stale file handle" errors indefinitely unless I perform one of
> the corrective steps.

You see "stale file handle" errors from the "cat test1.txt"? That's also
weird.

--b.
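
P.S. In case it helps narrow this down, here is a rough sketch of the quoted steps as a script. It is only an illustration under assumptions: the function names `compare_path_and_fd` and `rename_over_open_file` are made up for this example, and as written it does every step on one machine, so it shows the expected local behaviour rather than reproducing the two-node problem. To use it against the real setup, the rename would have to happen on node A while node B holds the file open and runs the comparison. The inode and ctime values it reports for the path versus the open descriptor might show whether the client ever notices the rename.

```python
import os
import tempfile


def compare_path_and_fd(path, held_file):
    """Summarize attributes seen by name and through an already-open file."""
    by_path = os.stat(path)               # fresh lookup by name
    by_fd = os.fstat(held_file.fileno())  # the file the process still holds open

    def summarize(st):
        return {"ino": st.st_ino, "ctime_ns": st.st_ctime_ns, "size": st.st_size}

    return summarize(by_path), summarize(by_fd)


def rename_over_open_file(directory):
    """Run steps 1, 3, 4 and one read from step 2 inside a single directory."""
    first = os.path.join(directory, "test1.txt")
    second = os.path.join(directory, "test2.txt")

    # Step 1: create two different files.
    with open(first, "w") as f:
        f.write("old\n")
    with open(second, "w") as f:
        f.write("new contents\n")

    held = open(first, "r")  # step 3: keep test1.txt open
    try:
        # Step 4: rename over the open file, like "mv -f test2.txt test1.txt".
        os.replace(second, first)
        # Step 2: what a single "cat test1.txt" would read after the rename.
        with open(first, "r") as f:
            seen = f.read()
        by_path, by_fd = compare_path_and_fd(first, held)
    finally:
        held.close()
    return seen, by_path, by_fd


if __name__ == "__main__":
    # Uses a scratch directory by default; point it at a test directory on
    # the NFS mount instead to look at the client-side view there.
    with tempfile.TemporaryDirectory() as scratch:
        seen, by_path, by_fd = rename_over_open_file(scratch)
        print("content read by name:", repr(seen))
        print("stat by name:        ", by_path)
        print("fstat on held fd:    ", by_fd)
```

On a local filesystem the read by name returns the new contents and the two inode numbers differ, since the held descriptor still refers to the replaced file. If the values reported by name on node B keep matching the held descriptor after the rename, that would at least confirm the client is still resolving the name to the old file.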