On Mon, 2018-12-31 at 19:18 +0000, Trond Myklebust wrote:
> On Mon, 2018-12-31 at 14:09 -0500, Chuck Lever wrote:
> > > On Dec 31, 2018, at 1:59 PM, Trond Myklebust
> > > <trondmy@xxxxxxxxxxxxxxx> wrote:
> > > 
> > > On Mon, 2018-12-31 at 13:44 -0500, Chuck Lever wrote:
> > > > > On Dec 31, 2018, at 1:09 PM, Trond Myklebust
> > > > > <trondmy@xxxxxxxxxxxxxxx> wrote:
> > > > > 
> > > > > On Thu, 2018-12-27 at 17:34 -0500, Chuck Lever wrote:
> > > > > > > On Dec 27, 2018, at 5:14 PM, Trond Myklebust
> > > > > > > <trondmy@xxxxxxxxxxxxxxx> wrote:
> > > > > > > 
> > > > > > > > On Dec 27, 2018, at 20:21, Chuck Lever
> > > > > > > > <chuck.lever@xxxxxxxxxx> wrote:
> > > > > > > > 
> > > > > > > > Hi Trond-
> > > > > > > > 
> > > > > > > > I've chased down a couple of remaining regressions with
> > > > > > > > the v4.20 NFS client, and they seem to be rooted in this
> > > > > > > > commit.
> > > > > > > > 
> > > > > > > > When using sec=krb5, krb5i, or krb5p I found that
> > > > > > > > multi-threaded workloads trigger a lot of server-side
> > > > > > > > disconnects. This is with TCP and RDMA transports. An
> > > > > > > > instrumented server shows that the client is
> > > > > > > > under-running the GSS sequence number window. I
> > > > > > > > monitored the order in which GSS sequence numbers
> > > > > > > > appear on the wire, and after this commit, the sequence
> > > > > > > > numbers are wildly misordered. If I revert the hunk in
> > > > > > > > xprt_request_enqueue_transmit, the problem goes away.
> > > > > > > > 
> > > > > > > > I also found that reverting that hunk results in a 3-4%
> > > > > > > > improvement in fio IOPS rates, as well as improvement
> > > > > > > > in average and maximum latency as reported by fio.
> > > > > > > 
> > > > > > > Hmm… Provided the sequence numbers still lie within the
> > > > > > > window, then why would the order matter?
> > > > > > 
> > > > > > The misordering is so bad that one request is delayed long
> > > > > > enough to fall outside the window. The new “need re-encode”
> > > > > > logic does not trigger.
> > > > > 
> > > > > That's weird. I can't see anything wrong with need re-encode
> > > > > at this point.
> > > > 
> > > > I don't think there is anything wrong with it, it looks like
> > > > it's not called in this case.
> > > 
> > > So you are saying that the call to rpcauth_xmit_need_reencode()
> > > is triggering the EBADMSG, but that this fails to cause a
> > > re-encode of the message?
> > 
> > No, I think what's going on is that the need_reencode happens when
> > the RPC is enqueued, and is successful.
> > 
> > But xprt_request_enqueue_transmit places the RPC somewhere in the
> > middle of xmit_queue. xmit_queue is long enough that more than 128
> > requests are before the enqueued request.
> 
> The test for rpcauth_xmit_need_reencode() happens when we call
> xprt_request_transmit() to actually put the RPC call on the wire.
> The enqueue order should not be able to defeat that test.
> 
> Hmm... Is it perhaps the test for req->rq_bytes_sent that is failing
> because this is a retransmission after a disconnect/reconnect that
> didn't trigger a re-encode?
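
For reference, the re-encode check sits inside the rq_bytes_sent test
in xprt_request_transmit(). Paraphrasing from memory, so the details
here are approximate rather than a verbatim copy of the v4.20 code:

    /* in net/sunrpc/xprt.c, xprt_request_transmit() */
    if (!req->rq_bytes_sent) {
        if (xprt_request_data_received(task)) {
            status = 0;
            goto out_dequeue;
        }
        /* Verify that our message lies in the RPCSEC_GSS window */
        if (rpcauth_xmit_need_reencode(task)) {
            status = -EBADMSG;
            goto out_dequeue;
        }
    }

If a retransmission comes through here with a non-zero
rq_bytes_sent, it never reaches rpcauth_xmit_need_reencode(), even if
its GSS sequence number has since fallen outside the window.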

Actually, it might be worth a try to move the test for
rpcauth_xmit_need_reencode() outside the enclosing test for
req->rq_bytes_sent as that is just a minor optimisation.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx
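
Untested sketch of that reordering, based on the same from-memory
reading of xprt_request_transmit() as above (illustrative only):

    /* Verify that our message lies in the RPCSEC_GSS window,
     * even when this is a retransmission with rq_bytes_sent != 0 */
    if (rpcauth_xmit_need_reencode(task)) {
        status = -EBADMSG;
        goto out_dequeue;
    }
    if (!req->rq_bytes_sent) {
        if (xprt_request_data_received(task)) {
            status = 0;
            goto out_dequeue;
        }
    }

That way a retransmission after a disconnect/reconnect still gets its
GSS sequence number checked before going back on the wire.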