Re: client caching and locks

Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> · Thu, 18 Jun 2020 14:29:42 +0000

On Thu, 2020-06-18 at 09:54 +0000, inoguchi.yuki@xxxxxxxxxxx wrote:
> > What does the client do to its cache when it writes to a locked
> > range?
> > 
> > The RFC:
> > 
> > 	https://tools.ietf.org/html/rfc7530#section-10.3.2
> > 
> > seems to imply that you should get something like local-filesystem
> > semantics if you write-lock any range that you write to and
> > read-lock any range that you read from.
> > 
> > But I see a report that when applications write to non-overlapping
> > ranges (while taking locks over those ranges), they don't see each
> > other's updates.
> > 
> > I think for simultaneous non-overlapping writes to work that way, the
> > client would need to invalidate its cache on unlock (except for the
> > locked range).  But I can't tell what the client's designed to do.
> 
> Simultaneous non-overlapping WRITEs are not taken into consideration
> in RFC7530.
> I personally think it is not necessary to deal with this case by
> modifying the kernel, because the application on the client can be
> implemented to avoid it.
> 
> Serializing the simultaneous operations may be one way to do that.
> Just before the write operation, each client locks and reads the
> overlapped range of data instead of obtaining a lock only on its own
> non-overlapping range.
> In that case they can pick up updates from other clients.
> 
> Yuki Inoguchi
> 
> > --b.

See the function 'fs/nfs/file.c:do_setlk()'. We flush dirty file data
both before and after taking the byte range lock. After taking the
lock, we force a revalidation of the data before returning control to
the application (unless there is a delegation that allows us to cache
more aggressively).

In addition, if you look at fs/nfs/file.c:do_unlk() you'll note that we
force a flush of all dirty file data before releasing the lock.
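
To make the ordering concrete, here is a toy model of the sequence
described above. It is a sketch, not the kernel code: Server, Client and
the has_delegation flag are made-up names, the cache is a whole-file
copy, and "revalidation" is simplified to dropping the cached data so
the next read goes back to the server.

```python
class Server:
    """Toy file server holding one file."""

    def __init__(self, size):
        self.data = bytearray(size)

    def write(self, offset, payload):
        self.data[offset:offset + len(payload)] = payload


class Client:
    """Toy client with a whole-file cache and a list of dirty writes."""

    def __init__(self, server):
        self.server = server
        self.cache = None   # cached copy of the file, or None if invalid
        self.dirty = []     # pending (offset, payload) writes

    def _fill(self):
        # Cache miss: fetch the current file contents from the server.
        if self.cache is None:
            self.cache = bytearray(self.server.data)

    def read(self, offset, length):
        self._fill()
        return bytes(self.cache[offset:offset + length])

    def write(self, offset, payload):
        # Writes land in the cache first and reach the server on flush.
        self._fill()
        self.cache[offset:offset + len(payload)] = payload
        self.dirty.append((offset, payload))

    def flush(self):
        for offset, payload in self.dirty:
            self.server.write(offset, payload)
        self.dirty.clear()

    def lock(self, has_delegation=False):
        self.flush()            # flush dirty data before taking the lock
        # (the lock request to the server would be sent here)
        self.flush()            # flush again after the lock is held
        if not has_delegation:
            self.cache = None   # simplified stand-in for revalidation

    def unlock(self):
        self.flush()            # flush dirty data before releasing the lock


def demo():
    server = Server(8)
    a, b = Client(server), Client(server)
    a.read(0, 8)
    b.read(0, 8)                # both clients warm their caches
    a.lock(); a.write(0, b"AAAA"); a.unlock()
    b.lock(); b.write(4, b"BBBB"); b.unlock()
    a.lock()
    seen = a.read(0, 8)         # read under the lock
    a.unlock()
    return seen
```

In this model demo() returns the bytes written by both clients, because
every unlock pushes dirty data to the server and every lock discards the
cached copy.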

Finally, note that we turn off assumptions of close-to-open caching
semantics when we detect that the application is using locking, and we
turn off optimisations such as assuming we can extend writes to page
boundaries when the page is marked as being up to date.

IOW: if all the clients are running Linux, then the thread that took
the lock should see 100% up to date data in the locked range. I believe
most (if not all) non-Linux clients use similar semantics when
taking/releasing byte range locks, so they too should be fine.
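
From the application side, the pattern under discussion looks roughly
like the sketch below: take an exclusive byte range lock, do the I/O
inside it, then unlock. locked_update() is a hypothetical helper, not
something from the thread. On a local filesystem it only exercises the
locking API; the flush and revalidation behaviour described above
applies when the file lives on an NFS mount.

```python
import fcntl
import os


def locked_update(fd, offset, data):
    """Write data at offset under an exclusive byte range lock.

    Returns the bytes that were in the range before the write.
    """
    length = len(data)
    # Blocks until the lock on [offset, offset + length) is granted.
    fcntl.lockf(fd, fcntl.LOCK_EX, length, offset, os.SEEK_SET)
    try:
        before = os.pread(fd, length, offset)   # read under the lock
        os.pwrite(fd, data, offset)             # write under the lock
        return before
    finally:
        # Release the lock on the same range.
        fcntl.lockf(fd, fcntl.LOCK_UN, length, offset, os.SEEK_SET)
```

The descriptor must be open for writing, since LOCK_EX is requested.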

The only 2 issues I can think of offhand that might blow things up are:

   1. The client thinks it holds a delegation when it does not (e.g.
      because the delegation was revoked) causing it to assume it can
      cache aggressively.
   2. The change attribute on the server implementation is based on a
      ctime with crappy resolution that causes the client to believe the
      data has not changed on the server even though it has (a.k.a. 'ext3
      syndrome').
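
Issue 2 can be illustrated with a toy calculation. This is only a
sketch: change_attr() and cache_looks_valid() are made-up names, and the
one-second resolution is an assumption chosen for illustration. The
point is that two modifications inside the same timestamp tick produce
the same change attribute, so a client comparing attributes concludes
that its cached data is still good.

```python
ONE_SECOND_NS = 1_000_000_000   # assumed coarse timestamp resolution


def change_attr(ctime_ns, resolution_ns):
    """Toy change attribute: the ctime truncated to the given resolution."""
    return ctime_ns - (ctime_ns % resolution_ns)


def cache_looks_valid(cached_attr, server_attr):
    """The client keeps its cache when the change attribute is unchanged."""
    return cached_attr == server_attr


# Two writes on the server, 200 ms apart, inside the same second.
first_write_ns = 5 * ONE_SECOND_NS + 100_000_000
second_write_ns = 5 * ONE_SECOND_NS + 300_000_000

# Coarse resolution: the second write is invisible to the client.
stale_undetected = cache_looks_valid(
    change_attr(first_write_ns, ONE_SECOND_NS),
    change_attr(second_write_ns, ONE_SECOND_NS),
)

# Nanosecond resolution: the attribute differs, so the client refetches.
change_detected = not cache_looks_valid(
    change_attr(first_write_ns, 1),
    change_attr(second_write_ns, 1),
)
```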

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx