On Mon, 15 Nov 2010 11:27:01 +1100 NeilBrown <neilb@xxxxxxx> wrote: > > Hi Bruce, > I found these while trying to diagnose why a customer hit the > BUG_ON(xprt->xpt_pool != NULL); > in svc_xprt_enqueue (in a 2.6.27 based kernel). I don't think either > of these actually explain the problem, but they seem to be bugs > none-the-less. > And I think I have now found the bug that caused the BUG_ON. The xprt has been freed and reused for something else so xpt_pool and other things are garbage. If you could review and apply this patch I would appreciate it. While exploring I noticed something that I thought was odd. In svc_rdma_transport.c, handle_connect_req calls rdma_create_xprt which will return an xprt with XPT_BUSY set. But I cannot see that XPT_BUSY is ever cleared. There is a comment saying that svc_xprt_received() cannot be called for some reason, but as far as I can see the reason is invalid, and svc_xprt_received is exactly what should be called to clear XPT_BUSY. But I don't really know this code so might be missing something important. Thanks, NeilBrown >From 0edb33a19f415a783d89839efbdf7c572a797043 Mon Sep 17 00:00:00 2001 From: NeilBrown <neilb@xxxxxxx> Date: Tue, 16 Nov 2010 16:55:19 +1100 Subject: [PATCH] sunrpc: close another server-size race. When an xprt is created, it has a refcount of 1, and XPT_BUSY is set. The refcount is *not* owned by the thread that created the xprt (as is clear from the fact that creators never put the reference). Rather, it is owned by the absence of XPT_DEAD. Once XPT_DEAD is set, (And XPT_BUSY is clear) that initial reference is dropped and the xprt can be freed. So when a creator clears XPT_BUSY it is dropping its only reference and so must not touch the xprt again. However svc_recv, after calling ->xpo_accept (and so getting an XPT_BUSY reference on a new xprt), calls svc_xprt_recieved. This clears XPT_BUSY and then svc_xprt_enqueue - this last without owning a reference. This is dangerous and has been seen to leave svc_xprt_enqueue working with an xprt containing garbage. So we need to hold an extra counted reference over that call to svc_xprt_received. For safety, any time we clear XPT_BUSY and then use the xprt again, we first get a reference, and the put it again afterwards. Note that svc_close_all does not need this extra protection as there are no threads running, and the final free can only be called asynchronously from such a thread. Signed-off-by: NeilBrown <neilb@xxxxxxx> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c index ec9d8ef..fa249ca 100644 --- a/net/sunrpc/svc_xprt.c +++ b/net/sunrpc/svc_xprt.c @@ -213,6 +213,7 @@ int svc_create_xprt(struct svc_serv *serv, const char *xprt_name, spin_lock(&svc_xprt_class_lock); list_for_each_entry(xcl, &svc_xprt_class_list, xcl_list) { struct svc_xprt *newxprt; + unsigned short newport; if (strcmp(xprt_name, xcl->xcl_name)) continue; @@ -231,8 +232,9 @@ int svc_create_xprt(struct svc_serv *serv, const char *xprt_name, spin_lock_bh(&serv->sv_lock); list_add(&newxprt->xpt_list, &serv->sv_permsocks); spin_unlock_bh(&serv->sv_lock); + newport = svc_xprt_local_port(newxprt); clear_bit(XPT_BUSY, &newxprt->xpt_flags); - return svc_xprt_local_port(newxprt); + return newport; } err: spin_unlock(&svc_xprt_class_lock); @@ -420,8 +422,13 @@ static struct svc_xprt *svc_xprt_dequeue(struct svc_pool *pool) void svc_xprt_received(struct svc_xprt *xprt) { BUG_ON(!test_bit(XPT_BUSY, &xprt->xpt_flags)); + /* As soon as we clear busy, the xprt could be closed and + * 'put', so we need a reference to call svc_xprt_enqueue with + */ + svc_xprt_get(xprt); clear_bit(XPT_BUSY, &xprt->xpt_flags); svc_xprt_enqueue(xprt); + svc_xprt_put(xprt); } EXPORT_SYMBOL_GPL(svc_xprt_received); -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html