Re: [PATCH RFC] nfs: ensure cached data is correct before using delegation

Hi Trond,

On Fri, 13 Jun 2014, Trond Myklebust wrote:

> On Fri, Jun 13, 2014 at 2:18 PM, Scott Mayhew <smayhew@xxxxxxxxxx> wrote:
> > nfs_write_pageuptodate()  bypasses the cache_validity flags whenever we
> > have a delegation... but in order to do that we need to be sure our
> > cached data is correct to begin with.
> > ---
> >  fs/nfs/delegation.c | 1 +
> >  fs/nfs/inode.c      | 1 +
> >  fs/nfs/nfs4proc.c   | 5 +++--
> >  3 files changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
> > index 5d8ccec..12f3eca 100644
> > --- a/fs/nfs/delegation.c
> > +++ b/fs/nfs/delegation.c
> > @@ -167,6 +167,7 @@ void nfs_inode_reclaim_delegation(struct inode *inode, struct rpc_cred *cred,
> >                         spin_unlock(&delegation->lock);
> >                         rcu_read_unlock();
> >                         nfs_inode_set_delegation(inode, cred, res);
> > +                       nfs_revalidate_mapping(inode, inode->i_mapping);
> 
> If you are reclaiming a delegation after a server reboot, then nobody
> is supposed to have changed the file.

Agreed.
> 
> >                 }
> >         } else {
> >                 rcu_read_unlock();
> > diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> > index c496f8a..95a9d21 100644
> > --- a/fs/nfs/inode.c
> > +++ b/fs/nfs/inode.c
> > @@ -1090,6 +1090,7 @@ int nfs_revalidate_mapping(struct inode *inode, struct address_space *mapping)
> >  out:
> >         return ret;
> >  }
> > +EXPORT_SYMBOL_GPL(nfs_revalidate_mapping);
> >
> >  static unsigned long nfs_wcc_update_inode(struct inode *inode, struct nfs_fattr *fattr)
> >  {
> > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > index 285ad53..a538aac 100644
> > --- a/fs/nfs/nfs4proc.c
> > +++ b/fs/nfs/nfs4proc.c
> > @@ -1361,11 +1361,12 @@ nfs4_opendata_check_deleg(struct nfs4_opendata *data, struct nfs4_state *state)
> >                                    "returning a delegation for "
> >                                    "OPEN(CLAIM_DELEGATE_CUR)\n",
> >                                    clp->cl_hostname);
> > -       } else if ((delegation_flags & 1UL<<NFS_DELEGATION_NEED_RECLAIM) == 0)
> > +       } else if ((delegation_flags & 1UL<<NFS_DELEGATION_NEED_RECLAIM) == 0) {
> >                 nfs_inode_set_delegation(state->inode,
> >                                          data->owner->so_cred,
> >                                          &data->o_res);
> > -       else
> > +               nfs_revalidate_mapping(state->inode, state->inode->i_mapping);
> > +       } else
> >                 nfs_inode_reclaim_delegation(state->inode,
> >                                              data->owner->so_cred,
> >                                              &data->o_res);
> 
> I'd really prefer to fix this in the part of the code that is actually broken.
> 
> I agree that we should ignore the NFS_INO_REVAL_PAGECACHE flag if we
> have a delegation and the NFS_INO_REVAL_FORCED is unset. However is it
> right to ignore NFS_INO_INVALID_DATA?
> 

No, I don't think it's right to ignore NFS_INO_INVALID_DATA, and
originally I was testing a fix similar to this:

---8<---
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 3ee5af4..98ff061 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -934,12 +934,14 @@ static bool nfs_write_pageuptodate(struct page *page, struct inode *inode)
 
        if (nfs_have_delegated_attributes(inode))
                goto out;
-       if (nfsi->cache_validity & (NFS_INO_INVALID_DATA|NFS_INO_REVAL_PAGECACHE))
+       if (nfsi->cache_validity & NFS_INO_REVAL_PAGECACHE)
                return false;
        smp_rmb();
        if (test_bit(NFS_INO_INVALIDATING, &nfsi->flags))
                return false;
 out:
+       if (nfsi->cache_validity & NFS_INO_INVALID_DATA)
+               return false;
        return PageUptodate(page) != 0;
 }
---8<---
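
As a rough userspace sketch of the decision the diff above is meant to
make (this is not kernel code: the sk_* names, the struct, and the bit
values are invented for illustration, and the memory barrier and real
page flags are left out):

```c
#include <stdbool.h>

/* Illustrative bit values only; the kernel's real flag values may differ. */
#define SK_INVALID_DATA     (1UL << 0)
#define SK_REVAL_PAGECACHE  (1UL << 1)

/* Hypothetical stand-in for the bits of struct nfs_inode used here. */
struct sk_inode {
	unsigned long cache_validity; /* SK_* bits above */
	bool delegated;               /* models nfs_have_delegated_attributes() */
	bool invalidating;            /* models the NFS_INO_INVALIDATING bit */
};

/* Models the patched check: may a write treat this page as up to date? */
static bool sk_write_pageuptodate(const struct sk_inode *ino, bool page_uptodate)
{
	if (!ino->delegated) {
		/* Without a delegation, a pending revalidation is disqualifying. */
		if (ino->cache_validity & SK_REVAL_PAGECACHE)
			return false;
		if (ino->invalidating)
			return false;
	}
	/* With the change above, INVALID_DATA is honoured on both paths. */
	if (ino->cache_validity & SK_INVALID_DATA)
		return false;
	return page_uptodate;
}
```

In this model a delegated inode with only SK_REVAL_PAGECACHE set still
trusts the page flag, while SK_INVALID_DATA forces false with or without
a delegation.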


However,

1) it wasn't really in keeping with the spirit of commit 8d197a56 (NFS:
Always trust the PageUptodate flag when we have a delegation), and

2) one of my test programs (used to test commit c7559663 (NFS: Allow
nfs_updatepage to extend a write under additional circumstances))
started performing poorly again, doing tons of sub page-sized writes
instead of a handful of wsize'd writes.

I did some more digging and I think I see 2 areas that could be
improved.

The first would be to clear NFS_INO_INVALID_DATA if we've just
truncated the inode to 0 bytes -- after all, if we've just unmapped
all the pages from the inode's address space then isn't our data
consistent?

---8<---
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index c496f8a..1078d06 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -584,6 +584,11 @@ void nfs_setattr_update_inode(struct inode *inode, struct iattr *attr)
        if ((attr->ia_valid & ATTR_SIZE) != 0) {
                nfs_inc_stats(inode, NFSIOS_SETATTRTRUNC);
                nfs_vmtruncate(inode, attr->ia_size);
+               if (attr->ia_size == 0) {
+                       spin_lock(&inode->i_lock);
+                       NFS_I(inode)->cache_validity &= ~NFS_INO_INVALID_DATA;
+                       spin_unlock(&inode->i_lock);
+               }
        }
 }
 EXPORT_SYMBOL_GPL(nfs_setattr_update_inode);
---8<---
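
A minimal userspace model of that idea, assuming a made-up struct and
made-up SK_* bit values (the real patch also takes inode->i_lock around
the flag update, which this single-threaded sketch omits):

```c
/* Illustrative bit values only; the kernel's real flag values may differ. */
#define SK_INVALID_DATA  (1UL << 0)
#define SK_INVALID_ATTR  (1UL << 1)

/* Hypothetical stand-in for the cached state of one inode. */
struct sk_inode {
	unsigned long cache_validity;    /* SK_* bits above */
	unsigned long long cached_bytes; /* how much file data is cached */
};

/* Models the truncate branch of the setattr update path. */
static void sk_truncate(struct sk_inode *ino, unsigned long long new_size)
{
	/* Stand-in for nfs_vmtruncate(): drop cached data past the new size. */
	if (ino->cached_bytes > new_size)
		ino->cached_bytes = new_size;

	/* Nothing is cached after a truncate to 0, so nothing can be stale. */
	if (new_size == 0)
		ino->cache_validity &= ~SK_INVALID_DATA;
}
```

Only the data flag is cleared, and only for a truncate to exactly 0;
other validity bits and truncates to a non-zero size are left alone.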


The second thing I noticed is that we're constantly invalidating our
cache due to the change attribute changing on the server.  But if we
have a write delegation then the change attribute changing must be the
result of *our* changes, in which case we should be able to just silently
update the change attribute on our side without invalidating our caches:

---8<---
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 1078d06..932c999 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1568,15 +1568,17 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
        /* More cache consistency checks */
        if (fattr->valid & NFS_ATTR_FATTR_CHANGE) {
                if (inode->i_version != fattr->change_attr) {
-                       dprintk("NFS: change_attr change on server for file %s/%ld\n",
+                       if (!NFS_PROTO(inode)->have_delegation(inode, FMODE_WRITE)) { 
+                               dprintk("NFS: change_attr change on server for file %s/%ld\n",
                                        inode->i_sb->s_id, inode->i_ino);
-                       invalid |= NFS_INO_INVALID_ATTR
-                               | NFS_INO_INVALID_DATA
-                               | NFS_INO_INVALID_ACCESS
-                               | NFS_INO_INVALID_ACL
-                               | NFS_INO_REVAL_PAGECACHE;
-                       if (S_ISDIR(inode->i_mode))
-                               nfs_force_lookup_revalidate(inode);
+                               invalid |= NFS_INO_INVALID_ATTR
+                                       | NFS_INO_INVALID_DATA
+                                       | NFS_INO_INVALID_ACCESS
+                                       | NFS_INO_INVALID_ACL
+                                       | NFS_INO_REVAL_PAGECACHE;
+                               if (S_ISDIR(inode->i_mode))
+                                       nfs_force_lookup_revalidate(inode);
+                       }
                        inode->i_version = fattr->change_attr;
                }
        } else if (server->caps & NFS_CAP_CHANGE_ATTR)
---8<---
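
The intended behaviour can be sketched in userspace roughly as follows.
The sk_* names, the struct, and the bit values are assumptions made for
illustration, and the directory case (nfs_force_lookup_revalidate) is
left out:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; the kernel's real flag values may differ. */
#define SK_INVALID_ATTR     (1UL << 0)
#define SK_INVALID_DATA     (1UL << 1)
#define SK_INVALID_ACCESS   (1UL << 2)
#define SK_INVALID_ACL      (1UL << 3)
#define SK_REVAL_PAGECACHE  (1UL << 4)
#define SK_INVALID_ALL      (SK_INVALID_ATTR | SK_INVALID_DATA | \
			     SK_INVALID_ACCESS | SK_INVALID_ACL | \
			     SK_REVAL_PAGECACHE)

/* Hypothetical stand-in for the inode state consulted by the check. */
struct sk_inode {
	uint64_t version;      /* models inode->i_version */
	bool write_delegation; /* models have_delegation(inode, FMODE_WRITE) */
};

/* Returns the invalidation bits a new change attribute would raise. */
static unsigned long sk_check_change_attr(struct sk_inode *ino,
					  uint64_t server_change)
{
	unsigned long invalid = 0;

	if (ino->version != server_change) {
		/* Under a write delegation the change is assumed to be ours. */
		if (!ino->write_delegation)
			invalid |= SK_INVALID_ALL;
		/* Either way, remember the server's current value. */
		ino->version = server_change;
	}
	return invalid;
}
```

With a write delegation the version is updated and no bits are raised;
without one, the same mismatch raises the full set of invalidation bits.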

If you think these 3 changes look alright then I'll do some more testing
and then send the patches (but I'd rather not spend too much time
testing if you see an issue with the changes in the first place).

Thanks,
Scott



