Re: mds getattr locked again

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2 Nov 2010, Henry C Chang wrote:
> Hi Sage,
> 
> I tested the latest unstable-backport plus
> 562d305287745e047370c49e78fd838f2a35899a reverted. However, the
> getattr still hanged (see the log file dmesg.ceph), but I think I
> found the problem (see the attached dmesg.ceph.henry).
> 
> Then, I made the following modification:
> 
> diff --git a/ceph/inode.c b/ceph/inode.c
> index 6444cac..afa0595 100644
> --- a/ceph/inode.c
> +++ b/ceph/inode.c
> @@ -1392,7 +1392,7 @@ static void ceph_invalidate_work(struct work_struct *work)
>         spin_lock(&inode->i_lock);
>         dout("invalidate_pages %p gen %d revoking %d\n", inode,
>              ci->i_rdcache_gen, ci->i_rdcache_revoking);
> -       if (ci->i_rdcache_gen == 0 ||
> +       if (inode->i_data.nrpages == 0 ||
>             ci->i_rdcache_revoking != ci->i_rdcache_gen) {
>                 BUG_ON(ci->i_rdcache_revoking > ci->i_rdcache_gen);
>                 /* nevermind! */
> 
> It seems working ok, but I don't know if this is the right way.
> Perhaps, not doing try_nonblocking_invalidate in handle_cap_grant is
> another way.

Aha, okay.  That's not the right fix, but points the way to the real 
problem.  Basically, the rdcache_gen/rdcache_revoking behavior is based on 
how things worked before we looked at i_data.nrpages.  We should never 
clear rdcache_gen (unless, perhaps, we're sure we don't hold the cap, but 
there's no point), and it being zero no longer has any special meaning.

Can you try the below patch?  I've done some light testing, but nothing 
serious yet.

Thanks-
sage




>From 86e44f09d6c37de381c549526f53b67344f34602 Mon Sep 17 00:00:00 2001
From: Sage Weil <sage@xxxxxxxxxxxx>
Date: Thu, 4 Nov 2010 11:05:05 -0700
Subject: [PATCH] ceph: fix rdcache_gen usage and invalidate

We used to use rdcache_gen to indicate whether we "might" have cached
pages.  Now we just look at the mapping to determine that.  However, some
old behavior remains from that transition.

First, rdcache_gen == 0 no longer means we have no pages.  That can happen
at any time (presumably when we carry FILE_CACHE).  We should not reset it
to zero, and we should not check that it is zero.

That means that the only purpose for rdcache_revoking is to resolve races
between new issues of FILE_CACHE and an async invalidate.  If they are
equal, we should invalidate.  On success, we decrement rdcache_revoking,
so that it is no longer equal to rdcache_gen.  Similarly, if we success
in doing a sync invalidate, set revoking = gen - 1.  (This is a small
optimization to avoid doing unnecessary invalidate work and does not
affect correctness.)

Signed-off-by: Sage Weil <sage@xxxxxxxxxxxx>
---
 fs/ceph/caps.c  |    4 ++--
 fs/ceph/inode.c |   16 +++++++---------
 fs/ceph/super.h |    4 +---
 3 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 6e0942f..36cad99 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1430,8 +1430,8 @@ static int try_nonblocking_invalidate(struct inode *inode)
 	    invalidating_gen == ci->i_rdcache_gen) {
 		/* success. */
 		dout("try_nonblocking_invalidate %p success\n", inode);
-		ci->i_rdcache_gen = 0;
-		ci->i_rdcache_revoking = 0;
+		/* save any racing async invalidate some trouble */
+		ci->i_rdcache_revoking = ci->i_rdcache_gen - 1;
 		return 0;
 	}
 	dout("try_nonblocking_invalidate %p failed\n", inode);
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 1d6a45b..1ca5226 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -1386,11 +1386,8 @@ static void ceph_invalidate_work(struct work_struct *work)
 	spin_lock(&inode->i_lock);
 	dout("invalidate_pages %p gen %d revoking %d\n", inode,
 	     ci->i_rdcache_gen, ci->i_rdcache_revoking);
-	if (ci->i_rdcache_gen == 0 ||
-	    ci->i_rdcache_revoking != ci->i_rdcache_gen) {
-		BUG_ON(ci->i_rdcache_revoking > ci->i_rdcache_gen);
+	if (ci->i_rdcache_revoking != ci->i_rdcache_gen) {
 		/* nevermind! */
-		ci->i_rdcache_revoking = 0;
 		spin_unlock(&inode->i_lock);
 		goto out;
 	}
@@ -1400,15 +1397,16 @@ static void ceph_invalidate_work(struct work_struct *work)
 	ceph_invalidate_nondirty_pages(inode->i_mapping);
 
 	spin_lock(&inode->i_lock);
-	if (orig_gen == ci->i_rdcache_gen) {
+	if (orig_gen == ci->i_rdcache_gen &&
+	    orig_gen == ci->i_rdcache_revoking) {
 		dout("invalidate_pages %p gen %d successful\n", inode,
 		     ci->i_rdcache_gen);
-		ci->i_rdcache_gen = 0;
-		ci->i_rdcache_revoking = 0;
+		ci->i_rdcache_revoking--;
 		check = 1;
 	} else {
-		dout("invalidate_pages %p gen %d raced, gen now %d\n",
-		     inode, orig_gen, ci->i_rdcache_gen);
+		dout("invalidate_pages %p gen %d raced, now %d revoking %d\n",
+		     inode, orig_gen, ci->i_rdcache_gen,
+		     ci->i_rdcache_revoking);
 	}
 	spin_unlock(&inode->i_lock);
 
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 1886294..7f01728 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -293,9 +293,7 @@ struct ceph_inode_info {
 	int i_rd_ref, i_rdcache_ref, i_wr_ref;
 	int i_wrbuffer_ref, i_wrbuffer_ref_head;
 	u32 i_shared_gen;       /* increment each time we get FILE_SHARED */
-	u32 i_rdcache_gen;      /* we increment this each time we get
-				   FILE_CACHE.  If it's non-zero, we
-				   _may_ have cached pages. */
+	u32 i_rdcache_gen;      /* incremented each time we get FILE_CACHE. */
 	u32 i_rdcache_revoking; /* RDCACHE gen to async invalidate, if any */
 
 	struct list_head i_unsafe_writes; /* uncommitted sync writes */
-- 
1.7.0

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux