In 2020, commit 7d81ee8722d6 ("svcrdma: Single-stage RDMA Read")
changed svcrdma's Read chunk handler to wait, in nfsd thread context,
for the completion of RDMA Reads from the client. The thought was
that fewer context switches should make for more efficient Read chunk
processing, since RDMA Read completion is typically very fast.

What I neglected to observe at the time is that if a client should
stop responding to RDMA Read requests or the RDMA transport should
fail to convey them (say, due to congestion), the herd of waiting
nfsd threads could result in a denial-of-service.

This is why the original svcrdma design was multi-staged: the server
schedules the RDMA Reads and then the nfsd thread is released for
other work; then the Read completions wake up another nfsd thread to
finish assembling the incoming RPC.

This series of patches reverts commit 7d81ee8722d6 ("svcrdma:
Single-stage RDMA Read") by replacing the current single-stage Read
mechanism with a reimplementation of the original multi-stage design.

Throughput and latency tests show a slight improvement with the new
handler.

---

Chuck Lever (4):
      svcrdma: Add back svc_rdma_recv_ctxt::rc_pages
      svcrdma: Add back svcxprt_rdma::sc_read_complete_q
      svcrdma: Copy construction of svc_rqst::rq_arg to rdma_read_complete()
      svcrdma: Implement multi-stage Read completion again

 include/linux/sunrpc/svc_rdma.h          |  11 +-
 include/trace/events/rpcrdma.h           |   1 +
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c  | 167 +++++++++++++++++++++--
 net/sunrpc/xprtrdma/svc_rdma_rw.c        | 149 +++++++-------------
 net/sunrpc/xprtrdma/svc_rdma_transport.c |   1 +
 5 files changed, 217 insertions(+), 112 deletions(-)

--
Chuck Lever