Re: Lookup revalidation for OPEN_CLAIM_FH

Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> · Thu, 16 Jan 2020 14:35:25 +0000

On Thu, 2020-01-16 at 08:51 -0500, Benjamin Coddington wrote:
> Hi Trond,
> 
> I'd like to fix up lookup revalidation for v4.1+ when the client is
> using
> OPEN_CLAIM_FH.  The fixes a while back for Stan Hu's case do not seem
> to
> improve things for v4.1, and actually make the behavior a bit worse
> since we
> no longer pass through nfs_lookup_verify_inode(), which would catch
> the
> cases where nlink == 0.
> 
> Would you accept work to _always_ revalidate the dentry's parent for
> CLAIM_FH?  Alternatively, it seems that CLAIM_NULL would be
> preferable for
> this case, though I don't know how the client would know when to
> decide
> between them.
>
> Here's a simple reproducer for convenience, I think we've already all
> agreed
> that the behavior we want is for the second open by `cat` to reflect
> the
> results of the move on the server, or at least eventually later opens
> would
> revalidate the dentry:
>
> #!/bin/bash
> 
> set -o xtrace
> vers=4.1
> 
> exportfs -ua
> exportfs -o rw,sec=sys,no_root_squash *:/exports
> 
> mkdir /mnt/localhost || true
> 
> rm -f /exports/file{1,2}
> 
> echo this is file 1 > /exports/file1
> echo this is file 2 > /exports/file2
> 
> mount -t nfs -ov$vers,sec=sys localhost:/exports /mnt/localhost
> 
> tail -f /mnt/localhost/file1 &
> sleep 1
> 
> # this is file 1
> cat /mnt/localhost/file1
> 
> # overwrite the file on the server:
> mv -f /exports/file2 /exports/file1
> 
> # this is file 2
> cat /mnt/localhost/file1
> 
> killall tail
> # this is file 2
> cat /mnt/localhost/file1
> umount /mnt/localhost
> 
> 
> Switching the $vers variable between v4.0 and v4.1 in this script
> shows the
> difference in behavior.
> 

If somebody needs stronger lookup cache revalidation, then that's what
they have the 'lookupcache=none' mount option for. We have these
'lookupcache' mount options in order to allow users to tailor the
caching behaviour (on a per-mount basis) should the default behaviour
be insufficiently strict.

Since your testcase doesn't use that mount option, I don't see what it
is proving other than what we already know about the default lookup
caching: namely that it sacrifices some accuracy in the interest of
file open performance.

By the way, NFSv4.1 will actually handle this situation better than
NFSv3, since the stateful open ensures that even if the file that won
the race was deleted by the other client, then the server will preserve
that file and its contents until our client calls close().

In conclusion:

 * With a default mount option of 'lookupcache=all', we don't promise
   100% accuracy in the face of 3rd party client changes to the
   namespace. We only promise that we will eventually pick up the
   changes. If we're failing to do that, then let's look at why, but
   'lookupcache=all' is not guaranteed to immediately find namespace
   changes.

 * With 'lookupcache=pos', we promise that the client will ignore
   negative  cached dentries, and hence will always find a file that
   was created using exclusive create on the server. However I'd expect
   that too to fail your test above.

 * If it fails with 'lookupcache=none', then I would agree that we have
   a problem.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx