[PATCH RFC 1/2] xprtrdma: Check inline size before providing a Write chunk

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



In very rare cases, an NFS READ operation might predict that the
non-payload part of the RPC Call is large. For instance, an
NFSv4 COMPOUND with a large GETATTR result, in combination with a
large Kerberos credential, could push the non-payload part to be
several kilobytes.

If the non-payload part is larger than the connection's inline
threshold, the client is required to provision a Reply chunk. The
current Linux client does not check for this case. There are two
obvious ways to handle it:

a. Provision a Write chunk for the payload and a Reply chunk for
   the non-payload part

b. Provision a Reply chunk for the whole RPC Reply

Some testing at a recent NFS bake-a-thon showed that servers can
mostly handle a. but there are some corner cases that do not work
yet. b. already works (it has to, to handle krb5i/p), but could be
somewhat less efficient. However, I expect this scenario to be very
rare -- no-one has reported a problem yet.

So I'm going to implement b. Sometime later I will provide some
patches to help make b. a little more efficient by more carefully
choosing the Reply chunk's segment sizes to ensure the payload is
optimally aligned.

Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
---
 net/sunrpc/xprtrdma/rpc_rdma.c |   18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index d18614e..7774aee 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -164,6 +164,21 @@ static bool rpcrdma_results_inline(struct rpcrdma_xprt *r_xprt,
 	return rqst->rq_rcv_buf.buflen <= ia->ri_max_inline_read;
 }
 
+/* The client is required to provide a Reply chunk if the maximum
+ * size of the non-payload part of the RPC Reply is larger than
+ * the inline threshold.
+ */
+static bool
+rpcrdma_nonpayload_inline(const struct rpcrdma_xprt *r_xprt,
+			  const struct rpc_rqst *rqst)
+{
+	const struct xdr_buf *buf = &rqst->rq_rcv_buf;
+	const struct rpcrdma_ia *ia = &r_xprt->rx_ia;
+
+	return buf->head[0].iov_len + buf->tail[0].iov_len <
+		ia->ri_max_inline_read;
+}
+
 /* Split @vec on page boundaries into SGEs. FMR registers pages, not
  * a byte range. Other modes coalesce these SGEs into a single MR
  * when they can.
@@ -762,7 +777,8 @@ static bool rpcrdma_results_inline(struct rpcrdma_xprt *r_xprt,
 	 */
 	if (rpcrdma_results_inline(r_xprt, rqst))
 		wtype = rpcrdma_noch;
-	else if (ddp_allowed && rqst->rq_rcv_buf.flags & XDRBUF_READ)
+	else if ((ddp_allowed && rqst->rq_rcv_buf.flags & XDRBUF_READ) &&
+		 rpcrdma_nonpayload_inline(r_xprt, rqst))
 		wtype = rpcrdma_writech;
 	else
 		wtype = rpcrdma_replych;




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux