> On Jan 3, 2019, at 1:47 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>
> On Thu, 2019-01-03 at 13:29 -0500, Chuck Lever wrote:
>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>> index d5ce1a8..66b08aa 100644
>> --- a/net/sunrpc/xprtsock.c
>> +++ b/net/sunrpc/xprtsock.c
>> @@ -678,6 +678,31 @@ static void xs_stream_data_receive_workfn(struct work_struct *work)
>>
>> #define XS_SENDMSG_FLAGS        (MSG_DONTWAIT | MSG_NOSIGNAL)
>>
>> +static int xs_send_record_marker(struct sock_xprt *transport,
>> +                                 const struct rpc_rqst *req)
>> +{
>> +        static struct msghdr msg = {
>> +                .msg_name       = NULL,
>> +                .msg_namelen    = 0,
>> +                .msg_flags      = (XS_SENDMSG_FLAGS | MSG_MORE),
>> +        };
>> +        rpc_fraghdr marker;
>> +        struct kvec iov = {
>> +                .iov_base       = &marker,
>> +                .iov_len        = sizeof(marker),
>> +        };
>> +        u32 reclen;
>> +
>> +        if (unlikely(!transport->sock))
>> +                return -ENOTSOCK;
>> +        if (req->rq_bytes_sent)
>> +                return 0;
>
> The test needs to use transport->xmit.offset, not req->rq_bytes_sent.

OK, that seems to work better.

> You also need to update transport->xmit.offset on success,

That causes the first 4 bytes of the rq_snd_buf to not be sent.
Not updating xmit.offset seems more correct.

> and be
> prepared to handle the case where < sizeof(marker) bytes get
> transmitted due to a write_space condition.

Probably the only recourse is to break the connection.

>> +
>> +        reclen = req->rq_snd_buf.len;
>> +        marker = cpu_to_be32(RPC_LAST_STREAM_FRAGMENT | reclen);
>> +        return kernel_sendmsg(transport->sock, &msg, &iov, 1, iov.iov_len);
>
>
> So what does this do for performance? I'd expect that adding another
> dive into the socket layer will come with penalties.

NFSv3 on TCP, sec=sys, 56Gb/s IPoIB, v4.20 + my v4.21 patches
fio, 8KB random, 70% read, 30% write, 16 threads, iodepth=16

Without this patch:

   read: IOPS=28.7k, BW=224MiB/s (235MB/s)(11.2GiB/51092msec)
  write: IOPS=12.3k, BW=96.3MiB/s (101MB/s)(4918MiB/51092msec)

With this patch:

   read: IOPS=28.6k, BW=224MiB/s (235MB/s)(11.2GiB/51276msec)
  write: IOPS=12.3k, BW=95.8MiB/s (100MB/s)(4914MiB/51276msec)

Seems like that's in the noise.

--
Chuck Lever
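
For readers following the thread, here is a minimal userspace sketch of the two points discussed above: how a single-fragment RPC record marker is encoded (last-fragment flag in the high bit, record length in the low 31 bits, big-endian on the wire), and how a sender might tell a short marker write from a complete one. This is not the kernel code. The names LAST_FRAGMENT_BIT, FRAGMENT_LEN_MASK, make_record_marker, marker_length, marker_is_last, and classify_marker_send are hypothetical and chosen for illustration only. The sketch assumes the whole record fits in one fragment, as the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>  /* htonl, ntohl */

/* Hypothetical names; the kernel patch uses RPC_LAST_STREAM_FRAGMENT. */
#define LAST_FRAGMENT_BIT 0x80000000u  /* high bit: this is the final fragment */
#define FRAGMENT_LEN_MASK 0x7fffffffu  /* low 31 bits: fragment length in bytes */

/* Build the 4-byte marker, in wire (big-endian) order, for a record
 * sent as a single fragment of reclen bytes. */
static uint32_t make_record_marker(uint32_t reclen)
{
    assert(reclen <= FRAGMENT_LEN_MASK);  /* length must fit in 31 bits */
    return htonl(LAST_FRAGMENT_BIT | reclen);
}

/* Decode helpers a receiver would use on the wire-order marker. */
static uint32_t marker_length(uint32_t wire)
{
    return ntohl(wire) & FRAGMENT_LEN_MASK;
}

static int marker_is_last(uint32_t wire)
{
    return (ntohl(wire) & LAST_FRAGMENT_BIT) != 0;
}

/* Classify the return value of a send call that was asked to write the
 * marker. A short write leaves a partial marker on the stream, which is
 * the case where breaking the connection was suggested in the thread. */
enum marker_send_result { MARKER_SENT, MARKER_SHORT, MARKER_ERROR };

static enum marker_send_result classify_marker_send(long sent)
{
    if (sent < 0)
        return MARKER_ERROR;
    if ((size_t)sent < sizeof(uint32_t))
        return MARKER_SHORT;
    return MARKER_SENT;
}
```

In this sketch a caller would treat MARKER_SHORT as fatal for the connection rather than try to resume mid-marker. That is one possible policy, matching the suggestion in the reply above, not a statement of what the final patch does.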