On Thu, 2020-01-16 at 10:13 -0500, Benjamin Coddington wrote: > On 16 Jan 2020, at 9:35, Trond Myklebust wrote: > > > On Thu, 2020-01-16 at 08:51 -0500, Benjamin Coddington wrote: > > > Hi Trond, > > > > > > I'd like to fix up lookup revalidation for v4.1+ when the client > > > is > > > using > > > OPEN_CLAIM_FH. The fixes a while back for Stan Hu's case do not > > > seem > > > to > > > improve things for v4.1, and actually make the behavior a bit > > > worse > > > since we > > > no longer pass through nfs_lookup_verify_inode(), which would > > > catch > > > the > > > cases where nlink == 0. > > > > > > Would you accept work to _always_ revalidate the dentry's parent > > > for > > > CLAIM_FH? Alternatively, it seems that CLAIM_NULL would be > > > preferable for > > > this case, though I don't know how the client would know when to > > > decide > > > between them. > > > > > > Here's a simple reproducer for convenience, I think we've already > > > all > > > agreed > > > that the behavior we want is for the second open by `cat` to > > > reflect > > > the > > > results of the move on the server, or at least eventually later > > > opens > > > would > > > revalidate the dentry: > > > > > > #!/bin/bash > > > > > > set -o xtrace > > > vers=4.1 > > > > > > exportfs -ua > > > exportfs -o rw,sec=sys,no_root_squash *:/exports > > > > > > mkdir /mnt/localhost || true > > > > > > rm -f /exports/file{1,2} > > > > > > echo this is file 1 > /exports/file1 > > > echo this is file 2 > /exports/file2 > > > > > > mount -t nfs -ov$vers,sec=sys localhost:/exports /mnt/localhost > > > > > > tail -f /mnt/localhost/file1 & > > > sleep 1 > > > > > > # this is file 1 > > > cat /mnt/localhost/file1 > > > > > > # overwrite the file on the server: > > > mv -f /exports/file2 /exports/file1 > > > > > > # this is file 2 > > > cat /mnt/localhost/file1 > > > > > > killall tail > > > # this is file 2 > > > cat /mnt/localhost/file1 > > > umount /mnt/localhost > > > > > > > > > Switching the $vers variable between v4.0 and v4.1 in this script > > > shows the > > > difference in behavior. > > > > > > > If somebody needs stronger lookup cache revalidation, then that's > > what > > they have the 'lookupcache=none' mount option for. We have these > > 'lookupcache' mount options in order to allow users to tailor the > > caching behaviour (on a per-mount basis) should the default > > behaviour > > be insufficiently strict. > > > > Since your testcase doesn't use that mount option, I don't see what > > it > > is proving other than what we already know about the default lookup > > caching: namely that it sacrifices some accuracy in the interest of > > file open performance. > > Thanks for the look. The testcase only provides a comparison of > different > behavior between v4.0 and v4.1, which is due to our use of CLAIM_NULL > vs > CLAIM_FH. > > Indeed, setting lookupcache=none give real-time updates. With > default > lookupcache, after the directory attributes time out, the client will > likewise get the namespace right. So, this isn't a major problem, > but we do > have some QE folks that are unhappy about it. > > Can we improve things a bit for v4.1 without sacrificing > performance? I > can't think of a reason to not go back to CLAIM_NULL in > nfs4_file_open(). > Maybe it is a bit more work on the server to have to do one extra > lookup per > open, but we'll end up with the right file each time. > > It makes sense to keep CLAIM_FH for recovery, but why keep it for > regular > opens? > nfs_file_open() is completely the wrong place to perform a lookup. Its purpose in the VFS is to allow the filesystem to set up state *after* we've already looked up the dentry, revalidated it and therefore decided which file to open. The NFSv4.0 behaviour of performing a new lookup is actually the aberration here, and is due to the fact that it does not have an open- by-filehandle operation, so we have no alternative. As I said, if you want stronger semantics, there are lookupcache mount options that allow you to tune things. I therefore see no valid reason to change the existing behaviour, which also matches that of older NFS versions (i.e. v3 and v2). -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx