Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Trond Myklebust wrote:
On Mon, 2009-01-05 at 13:33 -0600, Tom Tucker wrote:
Trond Myklebust wrote:
On Mon, 2009-01-05 at 11:04 -0600, Tom Tucker wrote:
Trond Myklebust wrote:
On Sun, 2009-01-04 at 14:12 -0500, Trond Myklebust wrote:
[...snip...]

I see nothing that stops svc_delete_xprt() from setting XPT_DEAD after
the above test in svc_revisit(), and before the test inside
svc_xprt_enqueue(). What's preventing a race there?
I suppose one way to fix it would be to hold the xprt->xpt_lock across
the above test, and to make sure that you set XPT_DEFERRED while holding
the lock, and _before_ you test for XPT_DEAD. That way, you guarantee
that the svc_deferred_dequeue() loop in svc_delete_xprt() will pick up
anything that races with the setting of XPT_DEAD.

Trond
I think this patch fixes this. Thanks again,

From: Tom Tucker <tom@xxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 5 Jan 2009 10:56:03 -0600
Subject: [PATCH] svc: Clean up deferred requests on transport destruction

A race between svc_revisit and svc_delete_xprt can result in
deferred requests holding references on a transport that can never be
recovered because dead transports are not enqueued for subsequent
processing.

Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
transport and sweep a transport's deferred queue to do the same for queued
but unprocessed deferrals.

Signed-off-by: Tom Tucker <tom@xxxxxxxxxxxxxxxxxxxxx>

---
  net/sunrpc/svc_xprt.c |   29 ++++++++++++++++++++++-------
  1 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index bf5b5cd..375a695 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -837,6 +837,15 @@ static void svc_age_temp_xprts(unsigned long closure)
  void svc_delete_xprt(struct svc_xprt *xprt)
  {
         struct svc_serv *serv = xprt->xpt_server;
+       struct svc_deferred_req *dr;
+
+       /* Only do this once */
+       spin_lock(&xprt->xpt_lock);
+       if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
+               spin_unlock(&xprt->xpt_lock);
+               return;
+       }
+       spin_unlock(&xprt->xpt_lock);
You shouldn't need to take the spinlock here if you just move the line
	set_bit(XPT_DEFERRED, &xprt->xpt_flags);
in svc_revisit(). See below...

I'm confused...sorry.

This lock in intended to avoid the following race:

				revisit:
					- test_bit == 0
svc_delete_xprt:
	- test_and_set_bit == 0
	- iterates over deferred queue,
	  but there's nothing in it yet
	  to clean up.

					- Adds deferred request to transport's
					  deferred list.
					- enqueue fails because XPT_DEAD is set

Now we've got a dangling reference.

The lock forces the delete to wait until the revisit had
added the deferral to the transport list.


         dprintk("svc: svc_delete_xprt(%p)\n", xprt);
         xprt->xpt_ops->xpo_detach(xprt);
@@ -851,12 +860,16 @@ void svc_delete_xprt(struct svc_xprt *xprt)
          * while still attached to a queue, the queue itself
          * is about to be destroyed (in svc_destroy).
          */
-       if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
-               BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
-               if (test_bit(XPT_TEMP, &xprt->xpt_flags))
-                       serv->sv_tmpcnt--;
+       if (test_bit(XPT_TEMP, &xprt->xpt_flags))
+               serv->sv_tmpcnt--;
+
+       for (dr = svc_deferred_dequeue(xprt); dr;
+            dr = svc_deferred_dequeue(xprt)) {
                 svc_xprt_put(xprt);
+               kfree(dr);
         }
+
+       svc_xprt_put(xprt);
         spin_unlock_bh(&serv->sv_lock);
  }

@@ -902,17 +915,19 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
                 container_of(dreq, struct svc_deferred_req, handle);
         struct svc_xprt *xprt = dr->xprt;

-       if (too_many) {
+       spin_lock(&xprt->xpt_lock);
 +        set_bit(XPT_DEFERRED, &xprt->xpt_flags);

Given the above, how does this avoid the race?

By setting XPT_DEFERRED, you will force svc_deferred_dequeue to wait for
the xprt->xpt_lock, which you are already holding. At that point, it
would be OK for the test of XPT_DEAD to race, since you are still
holding the xpt_lock, so the loop over svc_deferred_dequeue() will catch
it...


Right. I should have looked more carefully.


Cheers
  Trond



--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux