On Wed, 2008-12-17 at 09:35 -0600, Tom Tucker wrote: > Trond Myklebust wrote: > > Aside from being racy (there is nothing preventing someone setting XPT_DEAD > > after the test in svc_xprt_enqueue, and before XPT_BUSY is set), it is > > wrong to assume that transports which have called svc_delete_xprt() might > > not need to be re-enqueued. > > This is only true because now you allow transports with XPT_DEAD set to > be enqueued -- yes? > > > > > See the list of deferred requests, which is currently never going to > > be cleared if the revisit call happens after svc_delete_xprt(). In this > > case, the deferred request will currently keep a reference to the transport > > forever. > > > > I agree this is a possibility and it needs to be fixed. I'm concerned > that the root cause is still there though. I thought the test case was > the client side timing out the connection. Why are there deferred > requests sitting on what is presumably an idle connection? I haven't said that they are the cause of this test case. I've said that deferred requests hold references to the socket that can obviously deadlock. That needs to be fixed regardless of whether or not it is the cause here. There are plenty of situations in which the client may choose to close the TCP socket even if there are outstanding requests. One of the most common is when the user signals the process, so that an RPC call that was partially transmitted (ran out of buffer space) gets cancelled before it can finish transmitting. In that case the client has no choice but to disconnect and immediately reconnect. > > The fix should be to allow dead transports to be enqueued in order to clear > > the deferred requests, then change the order of processing in svc_recv() so > > that we pick up deferred requests before we do the XPT_CLOSE processing. > > > > Wouldn't it be simpler to clean up any deferred requests in the close > path instead of changing the meaning of XPT_DEAD and dispatching > N-threads to do the same? AFAICS, deferred requests are the property of the cache until they expire or a downcall occurs. I'm not aware of any way to cancel only those deferred requests that hold a reference to this particular transport. -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@xxxxxxxxxx www.netapp.com -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html