On Sun, 2009-01-04 at 14:12 -0500, Trond Myklebust wrote:
> On Fri, 2009-01-02 at 15:44 -0600, Tom Tucker wrote:
> > Bruce/Trond:
> >
> > This is an alternative to patches 2 and 3 from Trond's fix. I think
> > Trond's fix is correct, but I believe this approach to be simpler.
> >
> > From: Tom Tucker <tom@xxxxxxxxxxxxxxxxxxxxx>
> > Date: Wed, 31 Dec 2008 17:18:33 -0600
> > Subject: [PATCH] svc: Clean up deferred requests on transport destruction
> >
> > A race between svc_revisit and svc_delete_xprt can result in
> > deferred requests holding references on a transport that can never be
> > recovered because dead transports are not enqueued for subsequent
> > processing.
> >
> > Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
> > transport and sweep a transport's deferred queue to do the same for queued
> > but unprocessed deferrals.
> >
> > Signed-off-by: Tom Tucker <tom@xxxxxxxxxxxxxxxxxxxxx>
> > ---
> >  net/sunrpc/svc_xprt.c |   20 +++++++++++++++-----
> >  1 files changed, 15 insertions(+), 5 deletions(-)
> >
> > diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> > index bf5b5cd..92ca5c6 100644
> > --- a/net/sunrpc/svc_xprt.c
> > +++ b/net/sunrpc/svc_xprt.c
> > @@ -837,6 +837,11 @@ static void svc_age_temp_xprts(unsigned long closure)
> >  void svc_delete_xprt(struct svc_xprt *xprt)
> >  {
> >  	struct svc_serv *serv = xprt->xpt_server;
> > +	struct svc_deferred_req *dr;
> > +
> > +	/* Only do this once */
> > +	if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags))
> > +		return;
> >
> >  	dprintk("svc: svc_delete_xprt(%p)\n", xprt);
> >  	xprt->xpt_ops->xpo_detach(xprt);
> > @@ -851,12 +856,16 @@ void svc_delete_xprt(struct svc_xprt *xprt)
> >  	 * while still attached to a queue, the queue itself
> >  	 * is about to be destroyed (in svc_destroy).
> >  	 */
> > -	if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
> > -		BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
> > -		if (test_bit(XPT_TEMP, &xprt->xpt_flags))
> > -			serv->sv_tmpcnt--;
> > +	if (test_bit(XPT_TEMP, &xprt->xpt_flags))
> > +		serv->sv_tmpcnt--;
> > +
> > +	for (dr = svc_deferred_dequeue(xprt); dr;
> > +	     dr = svc_deferred_dequeue(xprt)) {
> >  		svc_xprt_put(xprt);
> > +		kfree(dr);
> >  	}
> > +
> > +	svc_xprt_put(xprt);
> >  	spin_unlock_bh(&serv->sv_lock);
> >  }
> >
> > @@ -902,7 +911,8 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
> >  		container_of(dreq, struct svc_deferred_req, handle);
> >  	struct svc_xprt *xprt = dr->xprt;
> >
> > -	if (too_many) {
> > +	if (too_many || test_bit(XPT_DEAD, &xprt->xpt_flags)) {
> > +		dprintk("revisit cancelled\n");
> >  		svc_xprt_put(xprt);
> >  		kfree(dr);
> >  		return;
> >
>
> I see nothing that stops svc_delete_xprt() from setting XPT_DEAD after
> the above test in svc_revisit(), and before the test inside
> svc_xprt_enqueue(). What's preventing a race there?

I suppose one way to fix it would be to hold the xprt->xpt_lock across
the above test, and to make sure that you set XPT_DEFERRED while holding
the lock, and _before_ you test for XPT_DEAD. That way, you guarantee
that the svc_deferred_dequeue() loop in svc_delete_xprt() will pick up
anything that races with the setting of XPT_DEAD.

Trond
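
To make the suggested ordering concrete, here is a rough userspace sketch.
It is not kernel code and has not been run against svc_xprt.c. Every
model_* name is hypothetical, a pthread mutex stands in for xpt_lock, and
C11 atomics stand in for the XPT_DEAD and XPT_DEFERRED bits. It assumes
the dequeue helper checks the "deferred" flag before taking the lock, and
that a revisit which finds the transport dead unlinks and frees its own
entry; both are one reading of the suggestion, not something the patch
states.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these names are kernel API. */
struct model_dreq {
	struct model_dreq *next;
};

struct model_xprt {
	pthread_mutex_t lock;      /* plays the role of xpt_lock */
	atomic_bool dead;          /* plays the role of XPT_DEAD */
	atomic_bool deferred;      /* plays the role of XPT_DEFERRED */
	struct model_dreq *queue;  /* deferred requests, protected by lock */
	atomic_int refs;           /* one for the transport, one per deferral */
};

static void model_xprt_init(struct model_xprt *x, int ndeferred)
{
	pthread_mutex_init(&x->lock, NULL);
	atomic_init(&x->dead, false);
	atomic_init(&x->deferred, false);
	x->queue = NULL;
	atomic_init(&x->refs, 1 + ndeferred);
}

/* Drop the reference held by one deferral and free the deferral. */
static void model_drop(struct model_xprt *x, struct model_dreq *dr)
{
	int old = atomic_fetch_sub(&x->refs, 1);

	assert(old > 0);
	(void)old;
	free(dr);
}

/*
 * Revisit path: under the lock, queue the deferral and set "deferred"
 * first, then test "dead". Returns true if the deferral stays queued.
 */
static bool model_revisit(struct model_xprt *x, struct model_dreq *dr)
{
	bool dead;

	pthread_mutex_lock(&x->lock);
	dr->next = x->queue;
	x->queue = dr;
	atomic_store(&x->deferred, true);
	dead = atomic_load(&x->dead);
	if (dead) {
		/* Lock still held, so dr is the head: unlink it again. */
		x->queue = dr->next;
		if (!x->queue)
			atomic_store(&x->deferred, false);
	}
	pthread_mutex_unlock(&x->lock);
	if (dead)
		model_drop(x, dr);
	return !dead;
}

/* Pop one deferral, checking the flag before taking the lock. */
static struct model_dreq *model_dequeue(struct model_xprt *x)
{
	struct model_dreq *dr;

	if (!atomic_load(&x->deferred))
		return NULL;
	pthread_mutex_lock(&x->lock);
	dr = x->queue;
	if (dr)
		x->queue = dr->next;
	if (!x->queue)
		atomic_store(&x->deferred, false);
	pthread_mutex_unlock(&x->lock);
	return dr;
}

/* Delete path: set "dead" exactly once, then sweep the deferred queue. */
static void model_delete_xprt(struct model_xprt *x)
{
	struct model_dreq *dr;

	if (atomic_exchange(&x->dead, true))
		return;
	while ((dr = model_dequeue(x)) != NULL)
		model_drop(x, dr);
	atomic_fetch_sub(&x->refs, 1); /* the transport's own reference */
}

/* Runs both orderings; returns the number of leaked references, or -1. */
int model_selftest(void)
{
	struct model_xprt a, b;
	struct model_dreq *d1 = malloc(sizeof(*d1));
	struct model_dreq *d2 = malloc(sizeof(*d2));
	int leaked;

	if (!d1 || !d2) {
		free(d1);
		free(d2);
		return -1;
	}
	model_xprt_init(&a, 1);
	model_revisit(&a, d1);   /* stays queued */
	model_delete_xprt(&a);   /* the sweep picks it up */

	model_xprt_init(&b, 1);
	model_delete_xprt(&b);   /* nothing queued yet */
	model_revisit(&b, d2);   /* sees "dead", cleans up itself */

	leaked = atomic_load(&a.refs) + atomic_load(&b.refs);
	pthread_mutex_destroy(&a.lock);
	pthread_mutex_destroy(&b.lock);
	return leaked;
}
```

Calling model_selftest() from a small main() exercises both orderings
sequentially. In each case exactly one side releases the deferral's
reference, which is the property the lock plus the
"set XPT_DEFERRED, then test XPT_DEAD" ordering is meant to give. The
sketch does not model svc_xprt_enqueue() or sv_lock.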