Re: cto changes for v4 atomic open

On 8/3/21 9:10 PM, Trond Myklebust wrote:


On Tue, 2021-08-03 at 21:51 -0400, Matt Benjamin wrote:
(who have performed an open)

On Tue, Aug 3, 2021 at 9:43 PM Matt Benjamin <mbenjami@xxxxxxxxxx> wrote:

I think it is how close-to-open has been traditionally understood. I do
not believe that close-to-open in any way implies a single writer;
rather, it sets the consistency expectation for all readers.


OK. I'll bite, despite the obvious troll-bait...


close-to-open implies a single writer because it is impossible to
guarantee ordering semantics in RPC. You could, in theory, do so by
serialising on the client, but none of us do that because we care about
performance.

If you don't serialise between clients, then it is trivial (and I'm
seriously tired of people who whine about this) to reproduce reads to
file areas that have not been fully synced to the server, despite
having data on the client that is writing. i.e. the reader sees holes
that never existed on the client that wrote the data.
The reason is that the writes got re-ordered en route to the server,
and so reads to the areas that have not yet been filled are showing up
as holes.
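
The reordering effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `server_apply`, the three 4-byte writes, and the arrival order are hypothetical and do not model any real NFS client or server. It only shows that a reader sampling the server's copy between out-of-order writes can see zero-filled holes that never existed on the writing client.

```python
def server_apply(server: bytearray, offset: int, data: bytes) -> None:
    """Apply one WRITE to the server's copy, zero-filling any gap (a hole)."""
    end = offset + len(data)
    if len(server) < end:
        server.extend(b"\x00" * (end - len(server)))
    server[offset:end] = data


# Three sequential 4-byte writes, in the order the writing client issued them.
writes = [(0, b"AAAA"), (4, b"BBBB"), (8, b"CCCC")]

# Hypothetical arrival order at the server: the last write lands first.
arrival_order = [2, 0, 1]

server_file = bytearray()
snapshots = []
for index in arrival_order:
    offset, data = writes[index]
    server_apply(server_file, offset, data)
    # What a second client would read if its READ arrived at this moment.
    snapshots.append(bytes(server_file))

for snap in snapshots:
    print(snap)
```

The first two snapshots contain runs of zero bytes in ranges the writer had already filled locally; only the last snapshot matches the writer's view.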

So, no, the close-to-open semantics definitely apply to both readers
and writers.


So, I have a naive question. When a client is writing to its cache, why wouldn't it be possible to send an alert to the server indicating that the file is being changed? The server would keep track of such files (client-cached, updated) and act accordingly, i.e. by sending a request to the client to flush its cache for that file if another client asks to open the file. The process could be bookended by the client alerting the server when the cached version has been fully synchronized with the copy on the server, so that the server wouldn't serve that file until the synchronization is complete. The only problem I can see with this is the client crashing or disconnecting before the file is fully written to the server, but then some timeout condition could be set.
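
To make the question concrete, the scheme could be sketched roughly as follows. Everything here is hypothetical (`Server`, `Client`, `notify_dirty`, `notify_synced`, `open` are made-up names for illustration, not an existing NFS interface), and the crash/timeout case is deliberately left out.

```python
class Server:
    """Hypothetical server that tracks which client holds unflushed data."""

    def __init__(self):
        self.data = b""
        self.dirty_holder = None  # client that announced cached changes

    def notify_dirty(self, client):
        self.dirty_holder = client

    def notify_synced(self, client):
        if self.dirty_holder is client:
            self.dirty_holder = None

    def open(self, client):
        holder = self.dirty_holder
        if holder is not None and holder is not client:
            # Ask the writing client to flush before serving the file.
            holder.flush()
        return self.data


class Client:
    """Hypothetical client that caches a whole-file write locally."""

    def __init__(self, server):
        self.server = server
        self.dirty = None

    def write(self, data):
        if self.dirty is None:
            self.server.notify_dirty(self)  # first bookend: "file is changing"
        self.dirty = data

    def flush(self):
        if self.dirty is not None:
            self.server.data = self.dirty
            self.dirty = None
            self.server.notify_synced(self)  # second bookend: "fully synced"


server = Server()
writer = Client(server)
reader = Client(server)

writer.write(b"hello")
seen_by_reader = server.open(reader)
print(seen_by_reader)
```

In this toy version the reader's open forces the writer's flush, so the reader sees the writer's data. A real design would still need an answer for a writer that never responds.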



Matt

On Tue, Aug 3, 2021 at 5:36 PM bfields@xxxxxxxxxxxx <bfields@xxxxxxxxxxxx> wrote:

On Tue, Aug 03, 2021 at 09:07:11PM +0000, Trond Myklebust wrote:
On Tue, 2021-08-03 at 16:30 -0400, J. Bruce Fields wrote:
On Fri, Jul 30, 2021 at 02:48:41PM +0000, Trond Myklebust wrote:
On Fri, 2021-07-30 at 09:25 -0400, Benjamin Coddington wrote:
I have some folks unhappy about behavior changes after:
479219218fbe NFS: Optimise away the close-to-open GETATTR when we have
NFSv4 OPEN

Before this change, a client holding a RO open would invalidate the
pagecache when doing a second RW open.

Now the client doesn't invalidate the pagecache, though technically it
could because we see a changeattr update on the RW OPEN response.

I feel this is a grey area in CTO if we're already holding an open.
Do we know how the client ought to behave in this case?  Should the
client's open upgrade to RW invalidate the pagecache?


It's not a "grey area in close-to-open" at all. It is very cut and
dried.

If you need to invalidate your page cache while the file is open, then
by definition you are in a situation where there is a write by another
client going on while you are reading. You're clearly not doing
close-to-open.

Documentation is really unclear about this case.  Every definition of
close-to-open that I've seen says that it requires a cache consistency
check on every application open.  I've never seen one that says "on
every open that doesn't overlap with an already-existing open on that
client".

They *usually* also preface that by saying that this is motivated by
the use case where opens don't overlap.  But it's never made clear that
that's part of the definition.


I'm not following your logic.

It's just a question of what every source I can find says close-to-open
means.  E.g., NFS Illustrated, p. 248, "Close-to-open consistency
provides a guarantee of cache consistency at the level of file opens and
closes.  When a file is closed by an application, the client flushes any
cached changes to the server.  When a file is opened, the client ignores
any cache time remaining (if the file data are cached) and makes an
explicit GETATTR call to the server to check the file modification
time."
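
The quoted description can be modelled as a toy client. This is a minimal sketch assuming a whole-file cache and an integer standing in for the modification time; `Server` and `Client` are hypothetical names and none of this is the Linux NFS client's actual logic. It only encodes the two rules in the quote: flush on close, GETATTR check on open.

```python
class Server:
    def __init__(self):
        self.data = b""
        self.mtime = 0  # stand-in for the file modification time

    def getattr(self):
        return self.mtime

    def write(self, data):
        self.data = data
        self.mtime += 1

    def read(self):
        return self.data


class Client:
    def __init__(self, server):
        self.server = server
        self.cached_data = None
        self.cached_mtime = None
        self.dirty = None

    def open(self):
        # Rule 2: on open, ignore cache timers and revalidate via GETATTR.
        mtime = self.server.getattr()
        if mtime != self.cached_mtime:
            self.cached_data = None
            self.cached_mtime = mtime

    def read(self):
        if self.dirty is not None:
            return self.dirty
        if self.cached_data is None:
            self.cached_data = self.server.read()
        return self.cached_data

    def write(self, data):
        self.dirty = data  # cached locally until close

    def close(self):
        # Rule 1: on close, flush cached changes to the server.
        if self.dirty is not None:
            self.server.write(self.dirty)
            self.cached_data = self.dirty
            self.cached_mtime = self.server.getattr()
            self.dirty = None


server = Server()
writer, reader = Client(server), Client(server)

reader.open(); first = reader.read(); reader.close()
writer.open(); writer.write(b"new")
reader.open(); before_close = reader.read(); reader.close()
writer.close()
reader.open(); after_close = reader.read(); reader.close()
print(first, before_close, after_close)
```

The reader only observes the writer's data on an open that follows the writer's close. The sketch says nothing about what a reader that is already holding the file open should see, which is the case being argued about in this thread.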

The close-to-open model assumes that the file is only being modified by
one client at a time and it assumes that file contents may be cached
while an application is holding it open.
The point checks exist in order to detect if the file is being changed
when the file is not open.

Linux does not have a per-application cache. It has a page cache that
is shared among all applications. It is impossible for two applications
to open the same file using buffered I/O, and yet see different
contents.

Right, so based on the descriptions like the one above, I would have
expected both applications to see new data at that point.

Maybe that's not practical to implement.  It'd be nice at least if that
was explicit in the documentation.

--b.



--

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309






