On Wed, 2021-08-04 at 11:30 +1000, NeilBrown wrote:
> On Wed, 04 Aug 2021, Trond Myklebust wrote:
> > On Wed, 2021-08-04 at 10:57 +1000, NeilBrown wrote:
> > > On Wed, 04 Aug 2021, Trond Myklebust wrote:
> > > >
> > > > No. What you propose is to optimise for a fringe case, which we
> > > > cannot guarantee will work anyway. I'd much rather optimise for
> > > > the common case, which is the only case with predictable
> > > > semantics.
> > >
> > > "predictable"??
> > >
> > > As I understand it (I haven't examined the code) the current
> > > semantics includes:
> > >   If a file is open for read, some other client changed the file,
> > >   and the file is then opened, then the second open might see new
> > >   data, or might see old data, depending on whether the requested
> > >   data is still in cache or not.
> > >
> > > I find this to be less predictable than the easy-to-understand
> > > semantics that Bruce has quoted:
> > >   - revalidate on every open, flush on every close
> > >
> > > I'm not suggesting we optimize for fringe cases, I'm suggesting
> > > we provide semantics that are simple, documented, and
> > > predictable.
> >
> > "Predictable" how?
> >
> > This is cached I/O. By definition, it is allowed to do things like
> > readahead, writeback caching, metadata caching. What you're
> > proposing is to optimise for a case that breaks all of the above.
> > What's the point? We might just as well throw in the towel and
> > just make uncached I/O and 'noac' mounts the default.
>
> How are readahead, and other caching broken? Indeed, how are they
> even predictable? Caching is almost by definition a best-effort.
> Read requests may, or may not, be served from read-ahead data.
> Writes may be written back sooner or later. Various system-load
> factors can affect this. You can never predict that a cache *will*
> be used.

Caching is not a "best effort" attempt.

The client is expected to provide a perfect reproduction of the data
stored on the server in the case where there is no close-to-open
violation.

In the case where there are close-to-open violations, then there are
two cases:

1. The user cares, and is using uncached I/O together with a
   synchronisation protocol in order to mitigate any data+metadata
   discrepancies between the client and server.

2. The user doesn't care, and we're in the standard buffered I/O
   case.

Why are you and Bruce insisting that case (2) needs to be treated as
special?

> "revalidate on every open, flush on every close" (in the absence of
> delegations of course) provides access to the only element of cache
> behaviour that *can* be predictable: the times when it *won't* be
> used.

No.

...and the very fact you had to qualify the above with "in the
absence of delegations" proves my point.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx