Re: [PATCH] nfs: don't extend writes to cover entire page if pagecache is invalid

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2012-12-10 at 11:05 -0500, Jeff Layton wrote:
> On Mon, 10 Dec 2012 14:42:23 +0000
> "Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx> wrote:
> 
> > On Mon, 2012-12-10 at 09:25 -0500, Jeff Layton wrote:
> > > Jian reported that the following sequence would leave "testfile" with
> > > corrupt data:
> > > 
> > >     # mount localhost:/export /mnt/nfs/ -o vers=3
> > >     # echo abc > /mnt/nfs/testfile; echo def >> /export/testfile; echo ghi >> /mnt/nfs/testfile
> > >     # cat -v /export/testfile
> > >     abc
> > >     ^@^@^@^@ghi
> > > 
> > > While there's no locking involved here, the operations are serialized,
> > > so CTO should prevent corruption.
> > > 
> > > The first write to the file is fine and writes 4 bytes. The file is then
> > > extended on the server. When it's reopened a GETATTR is issued and the
> > > size change is noticed. This causes NFS_INO_INVALID_DATA to be set on
> > > the file. Because the file is opened for write only,
> > > nfs_want_read_modify_write() returns 0 to nfs_write_begin().
> > > nfs_updatepage then calls nfs_write_pageuptodate() to see if it should
> > > extend the nfs_page to cover the whole page. NFS_INO_INVALID_DATA is
> > > still set on the file at that point, but that flag is ignored and
> > > nfs_pageuptodate erroneously extends the write to cover the whole page,
> > > with the write done on the server side filled in with zeroes.
> > > 
> > > This patch just has that function check for NFS_INO_INVALID_DATA in
> > > addition to NFS_INO_REVAL_PAGECACHE. This fixes the bug, but looking
> > > over the code, I wonder if we might have a similar bug in
> > > nfs_revalidate_size(). The difference between those two flags is very
> > > subtle, so it seems like we ought to be checking for
> > > NFS_INO_INVALID_DATA in most of the places that we look for
> > > NFS_INO_REVAL_PAGECACHE.
> > > 
> > > I believe this is regression introduced by commit 8d197a568. The code
> > > did check for NFS_INO_INVALID_DATA prior to that patch.
> > > 
> > > Original bug report is here:
> > > 
> > >     https://bugzilla.redhat.com/show_bug.cgi?id=885743
> > 
> > Hi Jeff,
> > 
> > The point of NFS_INO_REVAL_PAGECACHE is to be able to distinguish
> > between situations where the file size or change attribute is in doubt
> > (and so the page cache validity is in doubt), and situations where other
> > attributes are in doubt (such as ACCESS caches, file owner, nlinks,...).
> > This is why we're only checking for NFS_INO_REVAL_PAGECACHE and not
> > NFS_INO_INVALID_DATA in the functions you mention above.
> > 
> 
> Ok, now I'm really confused... I thought that:
> 
> NFS_INO_INVALID_DATA == pagecache data was known to be wrong
> NFS_INO_REVAL_PAGECACHE == pagecache data is questionable

There is no "known to be wrong" state. Both mean "needs revalidation".

> ...but if I understand what you're saying above, the "data" that
> NFS_INO_INVALID_DATA refers to is in peripheral caches, like the ACCESS
> cache? I guess then I don't quite understand what NFS_INO_INVALID_ATTR
> and NFS_INO_INVALID_ACCESS are for. Aren't they supposed to indicate that
> the attrs and access caches are invalid?

NFS_INO_INVALID_ATTR basically means that some aspect of the file
metadata needs revalidation, and so a call to nfs_revalidate_inode()
will always result in a GETATTR rpc call.

NFS_INO_INVALID_ACCESS means that the file access cache needs
revalidation. A call to nfs_do_access() will always result in an ACCESS
rpc call.

NFS_INO_REVAL_PAGECACHE means that the file page cache data needs
revalidation. A call to nfs_revalidate_file_size() or
nfs_revalidate_mapping() will result in a revalidation of the page
cache. In practice that means a GETATTR rpc call, but that may change if
future versions of NFS allow for different methods of revalidating the
page cache; consider, for instance, the byte-range delegation RFC that
Bruce and I published a few years back.

> > So the fix here should be to set NFS_INO_REVAL_PAGECACHE when the change
> > attribute and/or size change is noticed. The only function I can see
> > that appears to get this wrong is nfs_wcc_update_inode(). Does fixing
> > that lay the bug to rest?
> > 
> 
> No, that doesn't fix it. There's another place that misses setting that
> flag too in nfs_update_inode() where the size changes. I added it there
> too and that also didn't fix it. Here's why (I think):
> 
> When the size change is noticed via the CTO GETATTR call, then my
> patched kernel now sets NFS_INO_REVAL_PAGECACHE in nfs_update_inode.
> Now cache_validity is set to 0x2a. The kernel then makes an ACCESS call
> that also ultimately ends up in nfs_update_inode.
> 
> nfs_update_inode saves off the cache_validity flags and clears all of
> them except for NFS_INO_INVALID_DATA. As best I can tell, the
> saved_cache_validity is then ending up discarded and on exit the
> cache_validity is ending up set to just NFS_INO_INVALID_DATA.
> 
> It seems like we ought to be restoring the save_cache_validity flags
> before exiting that function, but I confess that I don't quite grasp
> the logic in nfs_update_inode. Thoughts?

If the ACCESS call isn't revalidating the file size, then it should not
be clearing NFS_INO_REVAL_PAGECACHE; it does this by restoring that bit
from the "save_cache_validity" variable and storing it in "invalid". As
far as I can tell, that code is correct in upstream.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com
��.n��������+%������w��{.n�����{��w���jg��������ݢj����G�������j:+v���w�m������w�������h�����٥



[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux