Re: cto changes for v4 atomic open

Rick Macklem <rmacklem@xxxxxxxxxxx> · Wed, 4 Aug 2021 15:42:42 +0000

Patrick Goetz wrote:
[stuff snipped]
>So, I have a naive question. When a client is writing to cache, why
>wouldn't it be possible to send an alert to the server indicating that
>the file is being changed. The server would keep track of such files
>(client cached, updated) and act accordingly; i.e. sending a request to
>the client to flush the cache for that file if another client is asking
>to open the file? The process could be bookended by the client alerting
>the server when the cached version has been fully synchronized with the
>copy on the server so that the server wouldn't serve that file until the
>synchronization is complete. The only problem I can see with this is the
>client crashing or disconnecting before the file is fully written to the
>server, but then some timeout condition could be set.

Well, I wouldn't call this a naive question.

There is no notification mechanism defined for any version of NFS.

However, although it isn't exactly a notification per se, in NFSv4
a client can exclusively lock a byte range (all bytes if desired).
The limitation is that all clients have to "play the game" and
acquire byte range locks before doing I/O on the file.
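
For anyone who wants to try the byte range lock approach from an
application, here is a minimal sketch in Python using POSIX advisory
locks. The helper names (locked_update, locked_read) are made up for
illustration. On an NFSv4 mount I would expect the client to turn these
calls into byte range lock requests to the server, but that mapping, and
whether taking the lock also refreshes cached data, is client behaviour
and is an assumption here, not something the code checks.

```python
import fcntl
import os


def locked_update(path, data, start=0, length=0):
    """Write `data` at offset `start` while holding an exclusive lock.

    length == 0 means "from `start` to end of file", so the defaults
    (start=0, length=0) lock all bytes of the file.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocks until the exclusive lock on the range is granted.
        fcntl.lockf(fd, fcntl.LOCK_EX, length, start, os.SEEK_SET)
        try:
            os.pwrite(fd, data, start)
            os.fsync(fd)  # flush the write before giving up the lock
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN, length, start, os.SEEK_SET)
    finally:
        os.close(fd)


def locked_read(path, start=0, length=0, nbytes=4096):
    """Read up to `nbytes` at offset `start` while holding a shared lock."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.lockf(fd, fcntl.LOCK_SH, length, start, os.SEEK_SET)
        try:
            return os.pread(fd, nbytes, start)
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN, length, start, os.SEEK_SET)
    finally:
        os.close(fd)
```

The locks are advisory, so this only helps if every reader and writer
goes through helpers like these. That is the "play the game" limitation.
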

I've always thought close-to-open consistency was sketchy
at best, and clients should use byte range locks if they care
about getting up-to-date file data for cases where other clients
might be writing the file.

The FreeBSD client only implements close-to-open consistency
approximately. It uses cached attributes (which may not be up to
date) to re-validate cached data upon open syscalls and doesn't
worry about mtime clock resolution for NFSv3.
--> As such, the client will see data written by another client within
      a bounded time, but not necessarily immediately after the writer
      closes the file on another client.
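
To illustrate the "bounded time" point, here is a toy model in Python.
It is not FreeBSD code. The class names, the integer mtime counter and
the attr_timeout parameter are all invented for the sketch, and it
ignores mtime clock resolution entirely. The only thing it shows is
that revalidating on open against cached attributes delays visibility
of another client's write by at most the attribute cache timeout.

```python
from dataclasses import dataclass


@dataclass
class Attrs:
    mtime: int         # modification "time" reported by the toy server
    fetched_at: float  # client clock value when the attributes were fetched


class ToyServer:
    def __init__(self):
        self.data = b""
        self.mtime = 0

    def write(self, data):
        # Stands in for another client writing and closing the file.
        self.data = data
        self.mtime += 1


class ApproxCtoClient:
    """Revalidates cached data on open using possibly stale cached attributes."""

    def __init__(self, server, attr_timeout):
        self.server = server
        self.attr_timeout = attr_timeout
        self.attrs = None         # cached attributes
        self.cached_data = None   # cached file contents
        self.cached_mtime = None  # mtime the cached contents correspond to

    def open_and_read(self, now):
        # Only ask the server for attributes when the cached copy has expired.
        if self.attrs is None or now - self.attrs.fetched_at >= self.attr_timeout:
            self.attrs = Attrs(self.server.mtime, now)
        # Revalidate the data cache against the cached attributes.
        if self.cached_data is None or self.cached_mtime != self.attrs.mtime:
            self.cached_data = self.server.data
            self.cached_mtime = self.attrs.mtime
        return self.cached_data


server = ToyServer()
client = ApproxCtoClient(server, attr_timeout=5)
server.write(b"v1")
first = client.open_and_read(now=0)   # fetches attributes and data
server.write(b"v2")                   # writer on another client closes
stale = client.open_and_read(now=1)   # cached attributes still trusted
fresh = client.open_and_read(now=5)   # attributes expired, data refetched
```

Setting attr_timeout=0 in this model forces an attribute fetch on every
open, which corresponds to the strict "GETATTR on every open" reading of
close-to-open.
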
When I work on the FreeBSD NFS client, it always seems to come
down to "correctness vs good performance via caching" or
"how incorrect can I get away with" if you prefer.

rick, who chooses to not have an opinion w.r.t. how the Linux
        NFS client should handle close-to-open consistency
ps: I just told Bruce I wasn't going to post, but...

>>> Matt
>>>
>>> On Tue, Aug 3, 2021 at 5:36 PM bfields@xxxxxxxxxxxx
>>> <bfields@xxxxxxxxxxxx> wrote:
>>>>
>>>> On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote:
>>>>> On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote:
>>>>>> On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust wrote:
>>>>>>> On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington wrote:
>>>>>>>> I have some folks unhappy about behavior changes after:
>>>>>>>> 479219218fbe NFS: Optimise away the close-to-open GETATTR when
>>>>>>>> we have NFSv4 OPEN
>>>>>>>>
>>>>>>>> Before this change, a client holding a RO open would invalidate
>>>>>>>> the pagecache when doing a second RW open.
>>>>>>>>
>>>>>>>> Now the client doesn't invalidate the pagecache, though
>>>>>>>> technically it could because we see a changeattr update on the
>>>>>>>> RW OPEN response.
>>>>>>>>
>>>>>>>> I feel this is a grey area in CTO if we're already holding an
>>>>>>>> open.  Do we know how the client ought to behave in this case?
>>>>>>>> Should the client's open upgrade to RW invalidate the pagecache?
>>>>>>>>
>>>>>>>
>>>>>>> It's not a "grey area in close-to-open" at all. It is very cut
>>>>>>> and dried.
>>>>>>>
>>>>>>> If you need to invalidate your page cache while the file is open,
>>>>>>> then by definition you are in a situation where there is a write
>>>>>>> by another client going on while you are reading. You're clearly
>>>>>>> not doing close-to-open.
>>>>>>
>>>>>> Documentation is really unclear about this case.  Every definition
>>>>>> of close-to-open that I've seen says that it requires a cache
>>>>>> consistency check on every application open.  I've never seen one
>>>>>> that says "on every open that doesn't overlap with an
>>>>>> already-existing open on that client".
>>>>>>
>>>>>> They *usually* also preface that by saying that this is motivated
>>>>>> by the use case where opens don't overlap.  But it's never made
>>>>>> clear that that's part of the definition.
>>>>>>
>>>>>
>>>>> I'm not following your logic.
>>>>
>>>> It's just a question of what every source I can find says
>>>> close-to-open means.  E.g., NFS Illustrated, p. 248, "Close-to-open
>>>> consistency provides a guarantee of cache consistency at the level
>>>> of file opens and closes.  When a file is closed by an application,
>>>> the client flushes any cached changes to the server.  When a file is
>>>> opened, the client ignores any cache time remaining (if the file
>>>> data are cached) and makes an explicit GETATTR call to the server to
>>>> check the file modification time."
>>>>
>>>>> The close-to-open model assumes that the file is only being
>>>>> modified by one client at a time and it assumes that file contents
>>>>> may be cached while an application is holding it open.
>>>>> The point checks exist in order to detect if the file is being
>>>>> changed when the file is not open.
>>>>>
>>>>> Linux does not have a per-application cache. It has a page cache
>>>>> that is shared among all applications. It is impossible for two
>>>>> applications to open the same file using buffered I/O, and yet see
>>>>> different contents.
>>>>
>>>> Right, so based on the descriptions like the one above, I would have
>>>> expected both applications to see new data at that point.
>>>>
>>>> Maybe that's not practical to implement.  It'd be nice at least if
>>>> that was explicit in the documentation.
>>>>
>>>> --b.
>>>>
>>>
>>>
>>> --
>>>
>>> Matt Benjamin
>>> Red Hat, Inc.
>>> 315 West Huron Street, Suite 140A
>>> Ann Arbor, Michigan 48103
>>>
>>> http://www.redhat.com/en/technologies/storage
>>>
>>> tel.  734-821-5101
>>> fax.  734-769-8938
>>> cel.  734-216-5309