> On Mar 1, 2020, at 1:09 PM, Tom Talpey <tom@xxxxxxxxxx> wrote:
>
> On 2/21/2020 2:00 PM, Chuck Lever wrote:
>> Howdy.
>>
>> I've had reports (and personal experience) where the Linux NFS/RDMA
>> client waits for a very long time after a disruption of the network
>> or NFS server.
>>
>> There is a disconnect time wait in the Connection Manager which
>> blocks the RPC/RDMA transport from tearing down a connection for a
>> few minutes when the remote cannot respond to DREQ messages.
>
> This seems really unfortunate. Why such a long wait in the RDMA layer?
> I can see a backoff, to prevent connection attempt flooding, but a
> constant "few minute" pause is a very blunt instrument.

The last clause here is the operative conundrum: "when the remote
cannot respond". That should be pretty rare, but it's frequent
enough to be bothersome in some environments.

As to why the time wait is so long, I don't know the answer to that.

>> An RPC/RDMA transport has only one slot for connection state, so the
>> transport is prevented from establishing a fresh connection until
>> the time wait completes.
>>
>> This patch series refactors the connection end point data structures
>> to enable one active and multiple zombie connections. Now, while a
>> defunct connection is waiting to die, it is separated from the
>> transport, clearing the way for the immediate creation of a new
>> connection. Clean-up of the old connection's data structures and
>> resources then completes in the background.
>
> This is a good idea in any case. It separates the layers, and leads
> to better connection establishment throughput.
>
> Does the RPCRDMA layer ensure it backs off, if connection retries
> fail? Or are you depending on the NFS upper layer for this.

There is a complicated back-off scheme that is modeled on the TCP
connection back-off logic.

> Tom.
>
>> Well, that's the idea, anyway. Review and comments welcome. Hoping
>> this can be merged in v5.7.
>> ---
>>
>> Chuck Lever (11):
>>       xprtrdma: Invoke rpcrdma_ep_create() in the connect worker
>>       xprtrdma: Refactor frwr_init_mr()
>>       xprtrdma: Clean up the post_send path
>>       xprtrdma: Refactor rpcrdma_ep_connect() and rpcrdma_ep_disconnect()
>>       xprtrdma: Allocate Protection Domain in rpcrdma_ep_create()
>>       xprtrdma: Invoke rpcrdma_ia_open in the connect worker
>>       xprtrdma: Remove rpcrdma_ia::ri_flags
>>       xprtrdma: Disconnect on flushed completion
>>       xprtrdma: Merge struct rpcrdma_ia into struct rpcrdma_ep
>>       xprtrdma: Extract sockaddr from struct rdma_cm_id
>>       xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt
>>
>>  include/trace/events/rpcrdma.h    |   97 ++---
>>  net/sunrpc/xprtrdma/backchannel.c |    8
>>  net/sunrpc/xprtrdma/frwr_ops.c    |  152 ++++----
>>  net/sunrpc/xprtrdma/rpc_rdma.c    |   32 +-
>>  net/sunrpc/xprtrdma/transport.c   |   72 +---
>>  net/sunrpc/xprtrdma/verbs.c       |  681 ++++++++++++++-----------------------
>>  net/sunrpc/xprtrdma/xprt_rdma.h   |   89 ++---
>>  7 files changed, 445 insertions(+), 686 deletions(-)
>>
>> --
>> Chuck Lever

--
Chuck Lever
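
Illustrative note on the back-off question above: the reply says only that the reconnect back-off is modeled on the TCP connection back-off logic. The general pattern that usually implies is a capped exponential back-off, and a minimal sketch of it in C might look like the following. The function name and both constants are hypothetical, chosen for illustration only; they are not taken from the xprtrdma source, and the real scheme is described as more complicated than this.

```c
/*
 * Capped exponential back-off, sketched under assumptions.
 * INIT_REEST_TO and MAX_REEST_TO are hypothetical bounds in seconds,
 * not values read from the xprtrdma code.
 */
#define INIT_REEST_TO 5U
#define MAX_REEST_TO  30U

/*
 * Given the delay used before the attempt that just failed, return
 * the delay to use before the next attempt: start at the initial
 * value, then double on each failure, never exceeding the maximum.
 */
unsigned int next_reconnect_delay(unsigned int cur)
{
	if (cur < INIT_REEST_TO)
		return INIT_REEST_TO;

	cur *= 2;
	return cur > MAX_REEST_TO ? MAX_REEST_TO : cur;
}
```

Starting from a delay of 0 and feeding each result back in, successive failures would wait 5, 10, 20, 30, 30, ... seconds under these assumed bounds. A successful connect would reset the delay to 0 so the next disruption starts from the short initial wait again.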