Re: [PATCH v1 00/11] NFS/RDMA client side connection overhaul

On 2/21/2020 2:00 PM, Chuck Lever wrote:
> Howdy.
>
> I've had reports (and personal experience) where the Linux NFS/RDMA
> client waits for a very long time after a disruption of the network
> or NFS server.
>
> There is a disconnect time wait in the Connection Manager which
> blocks the RPC/RDMA transport from tearing down a connection for a
> few minutes when the remote cannot respond to DREQ messages.

This seems really unfortunate. Why such a long wait in the RDMA layer?
I can see a backoff, to prevent connection attempt flooding, but a
constant "few minute" pause is a very blunt instrument.

> An RPC/RDMA transport has only one slot for connection state, so the
> transport is prevented from establishing a fresh connection until
> the time wait completes.
>
> This patch series refactors the connection end point data structures
> to enable one active and multiple zombie connections. Now, while a
> defunct connection is waiting to die, it is separated from the
> transport, clearing the way for the immediate creation of a new
> connection. Clean-up of the old connection's data structures and
> resources then completes in the background.
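
For readers following the thread, the "one active, multiple zombie" arrangement
described above can be sketched in a few lines of userspace C. This is only an
illustration under stated assumptions, not the xprtrdma code: struct ep,
struct xprt, ep_create(), ep_put(), xprt_detach(), xprt_reconnect() and the
eps_freed counter are all hypothetical names, and the plain integer refcount
stands in for whatever reference counting the series actually uses.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the kernel's rpcrdma structures. */
struct ep {
	int refcount;	/* endpoint is freed when this drops to zero */
	int connected;	/* 1 while the connection is usable */
};

struct xprt {
	struct ep *active;	/* at most one live endpoint per transport */
};

static int eps_freed;	/* instrumentation for this example only */

/* Allocate an endpoint separately from the transport that will use it. */
static struct ep *ep_create(void)
{
	struct ep *ep = calloc(1, sizeof(*ep));

	if (ep) {
		ep->refcount = 1;
		ep->connected = 1;
	}
	return ep;
}

/* Drop one reference; the last put releases the endpoint. */
static void ep_put(struct ep *ep)
{
	if (--ep->refcount == 0) {
		free(ep);
		eps_freed++;
	}
}

/*
 * Separate a defunct endpoint from the transport. The transport's
 * reference passes to the caller, who calls ep_put() once the
 * endpoint's time wait has finished (the "zombie" phase).
 */
static struct ep *xprt_detach(struct xprt *x)
{
	struct ep *zombie = x->active;

	x->active = NULL;
	if (zombie)
		zombie->connected = 0;
	return zombie;
}

/* With the slot empty, a fresh endpoint can be created immediately. */
static int xprt_reconnect(struct xprt *x)
{
	x->active = ep_create();
	return x->active ? 0 : -1;
}
```

In this sketch the transport never waits on the old endpoint: xprt_detach()
empties the slot, xprt_reconnect() fills it again right away, and the zombie is
released later by whoever holds its last reference.
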

This is a good idea in any case. It separates the layers, and leads
to better connection establishment throughput.

Does the RPC/RDMA layer ensure it backs off if connection retries
fail? Or are you depending on the NFS upper layer for this?
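
As a concrete picture of the kind of backoff being asked about, here is a
minimal sketch of a capped exponential retry delay in userspace C. It is not
taken from xprtrdma or the SunRPC layer; reconnect_delay_ms() and the
BACKOFF_MIN_MS and BACKOFF_MAX_MS bounds are hypothetical names and values
chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical bounds, chosen only for illustration. */
#define BACKOFF_MIN_MS 100u
#define BACKOFF_MAX_MS 30000u

/*
 * Return the delay before reconnect attempt number `attempt`
 * (0 = first retry). The delay doubles after each failed attempt
 * and is clamped at BACKOFF_MAX_MS, so a persistent outage does
 * not turn into a flood of connection attempts.
 */
static uint32_t reconnect_delay_ms(unsigned int attempt)
{
	uint32_t delay = BACKOFF_MIN_MS;

	/* Stop doubling once the cap is reached, which also avoids overflow. */
	while (attempt-- > 0 && delay < BACKOFF_MAX_MS)
		delay *= 2;

	return delay > BACKOFF_MAX_MS ? BACKOFF_MAX_MS : delay;
}
```

Unlike a constant multi-minute pause, the first retries here happen quickly and
only repeated failures push the delay toward the cap.
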

Tom.

> Well, that's the idea, anyway. Review and comments welcome. Hoping
> this can be merged in v5.7.
>
> ---
>
> Chuck Lever (11):
>        xprtrdma: Invoke rpcrdma_ep_create() in the connect worker
>        xprtrdma: Refactor frwr_init_mr()
>        xprtrdma: Clean up the post_send path
>        xprtrdma: Refactor rpcrdma_ep_connect() and rpcrdma_ep_disconnect()
>        xprtrdma: Allocate Protection Domain in rpcrdma_ep_create()
>        xprtrdma: Invoke rpcrdma_ia_open in the connect worker
>        xprtrdma: Remove rpcrdma_ia::ri_flags
>        xprtrdma: Disconnect on flushed completion
>        xprtrdma: Merge struct rpcrdma_ia into struct rpcrdma_ep
>        xprtrdma: Extract sockaddr from struct rdma_cm_id
>        xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt
>
>
>   include/trace/events/rpcrdma.h    |   97 ++---
>   net/sunrpc/xprtrdma/backchannel.c |    8
>   net/sunrpc/xprtrdma/frwr_ops.c    |  152 ++++----
>   net/sunrpc/xprtrdma/rpc_rdma.c    |   32 +-
>   net/sunrpc/xprtrdma/transport.c   |   72 +---
>   net/sunrpc/xprtrdma/verbs.c       |  681 ++++++++++++++-----------------------
>   net/sunrpc/xprtrdma/xprt_rdma.h   |   89 ++---
>   7 files changed, 445 insertions(+), 686 deletions(-)
>
> --
> Chuck Lever




