On Mon, 25 Oct 2010 19:03:35 -0400 "J. Bruce Fields" <bfields@xxxxxxxxxxxx>
wrote:

> On Tue, Oct 26, 2010 at 09:58:36AM +1100, Neil Brown wrote:
> > On Mon, 25 Oct 2010 16:21:56 -0400
> > "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:
> >
> > > On Mon, Oct 25, 2010 at 12:43:57PM +1100, Neil Brown wrote:
> > > > On Sun, 24 Oct 2010 21:21:30 -0400
> > > > "J. Bruce Fields" <bfields@xxxxxxxxxx> wrote:
> > > >
> > > > > Once an xprt has been deleted, there's no reason to allow it to be
> > > > > enqueued--at worst, that might cause the xprt to be re-added to some
> > > > > global list, resulting in later corruption.
> > > > >
> > > > > Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxx>
> > > >
> > > > Yep, this makes svc_close_xprt() behave the same way as svc_recv(),
> > > > which calls svc_delete_xprt but does not clear XPT_BUSY.  The other
> > > > branches in svc_recv call svc_xprt_received, but the XPT_CLOSE branch
> > > > doesn't.
> > > >
> > > > Reviewed-by: NeilBrown <neilb@xxxxxxx>
> > >
> > > Also, of course:
> > >
> > > > >   	svc_xprt_get(xprt);
> > > > >   	svc_delete_xprt(xprt);
> > > > > - 	clear_bit(XPT_BUSY, &xprt->xpt_flags);
> > > > >   	svc_xprt_put(xprt);
> > >
> > > The get/put is pointless: the only reason I can see for doing that, of
> > > course, was to be able to safely clear the bit afterwards.
> >
> > Agreed.
> >
> > I like patches that get rid of code!!
>
> Unfortunately, I'm stuck on just one more point: is svc_close_all()
> really safe?  It assumes it doesn't need any locking to speak of any
> more because the server threads are gone--but the xprt's themselves
> could still be producing events, right?  (So data could be arriving that
> results in calls to svc_xprt_enqueue, for example?)
>
> If that's right, I'm not sure what to do there....
>
> --b.

Yes, svc_close_all is racy w.r.t. svc_xprt_enqueue.  I guess we've never
lost that race?

The race happens if the test_and_set(XPT_BUSY) in svc_xprt_enqueue happens
before the test_bit(XPT_BUSY) in svc_close_all, but the list_add_tail at
the end of svc_xprt_enqueue happens after (or during!) the list_del_init
in svc_close_all.

We cannot really lock against this race, as svc_xprt_enqueue holds the pool
lock, and svc_close_all doesn't know which pool to lock (as xprt->pool isn't
set until after XPT_BUSY is set).

Maybe we just need to lock all pools in that case??  So svc_close_all
becomes something like:

void svc_close_all(struct list_head *xprt_list)
{
        struct svc_xprt *xprt;
        struct svc_xprt *tmp;
        struct svc_pool *pool;

        list_for_each_entry_safe(xprt, tmp, xprt_list, xpt_list) {
                set_bit(XPT_CLOSE, &xprt->xpt_flags);
                if (test_and_set_bit(XPT_BUSY, &xprt->xpt_flags)) {
                        /* Waiting to be processed, but no threads left,
                         * so just remove it from the waiting list.  First
                         * we need to ensure svc_xprt_enqueue isn't still
                         * queuing the xprt to some pool.
                         */
                        for_each_pool(pool, xprt->xpt_server) {
                                spin_lock(&pool->sp_lock);
                                spin_unlock(&pool->sp_lock);
                        }
                        list_del_init(&xprt->xpt_ready);
                }
                svc_delete_xprt(xprt);
        }
}

Note that XPT_BUSY is now always set and stays set, so we call
svc_delete_xprt instead of svc_close_xprt.

Maybe we don't actually need the list_del_init - both the pool and the xprt
will soon be freed, and if there is linkage between them, who cares??  In
that case we wouldn't need the for_each_pool after all??

NeilBrown
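
To make the losing interleaving concrete, here is a minimal, self-contained
user-space sketch of the ordering described above (the enqueue wins XPT_BUSY
first, but its list_add_tail lands only after svc_close_all has already done
its list_del_init).  All names here (fake_xprt, enqueue_finishes_late,
close_one) are invented for illustration, and the bit/list operations are
reduced to plain booleans; this is a model of the race, not the real sunrpc
code:

/*
 * Simplified model of the svc_xprt_enqueue vs svc_close_all race.
 * Compile with: cc -o race-sketch race-sketch.c
 */
#include <stdio.h>
#include <stdbool.h>

struct fake_xprt {
        bool busy;              /* stands in for XPT_BUSY */
        bool on_ready_list;     /* stands in for xpt_ready being linked */
};

/* The tail of a simplified svc_xprt_enqueue: XPT_BUSY was already won. */
static void enqueue_finishes_late(struct fake_xprt *xprt)
{
        xprt->on_ready_list = true;     /* list_add_tail(&xprt->xpt_ready, ...) */
}

/* A simplified svc_close_all body, handling a single xprt. */
static void close_one(struct fake_xprt *xprt)
{
        if (xprt->busy) {
                /*
                 * XPT_BUSY is seen set, so the close path assumes the enqueue
                 * has already linked the xprt and tries to unlink it.  If the
                 * list_add_tail has not happened yet, this is a no-op...
                 */
                xprt->on_ready_list = false;    /* list_del_init(&xprt->xpt_ready) */
        }
        /* ...and the xprt is then deleted (svc_delete_xprt). */
}

int main(void)
{
        struct fake_xprt xprt = { .busy = true, .on_ready_list = false };

        /* Losing interleaving: the close runs first, the enqueue lands after. */
        close_one(&xprt);
        enqueue_finishes_late(&xprt);

        printf("deleted xprt still on the ready list: %s\n",
               xprt.on_ready_list ? "yes (corruption)" : "no");
        return 0;
}

This is also why the pool lock/unlock pass in Neil's sketch helps: an enqueue
that has already won XPT_BUSY is holding its pool's lock until the
list_add_tail is done, so once svc_close_all has taken and dropped every pool
lock, the add has either completed or will never happen, and the following
list_del_init sees a consistent list.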