Re: [PATCH v3 26/44] SUNRPC: Improve latency for interactive tasks

Chuck Lever <chuck.lever@xxxxxxxxxx> · Mon, 31 Dec 2018 14:09:08 -0500

> On Dec 31, 2018, at 1:59 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> 
> On Mon, 2018-12-31 at 13:44 -0500, Chuck Lever wrote:
>>> On Dec 31, 2018, at 1:09 PM, Trond Myklebust <
>>> trondmy@xxxxxxxxxxxxxxx> wrote:
>>> 
>>> On Thu, 2018-12-27 at 17:34 -0500, Chuck Lever wrote:
>>>>> On Dec 27, 2018, at 5:14 PM, Trond Myklebust <
>>>>> trondmy@xxxxxxxxxxxxxxx> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Dec 27, 2018, at 20:21, Chuck Lever <
>>>>>> chuck.lever@xxxxxxxxxx>
>>>>>> wrote:
>>>>>> 
>>>>>> Hi Trond-
>>>>>> 
>>>>>> I've chased down a couple of remaining regressions with the
>>>>>> v4.20
>>>>>> NFS client,
>>>>>> and they seem to be rooted in this commit.
>>>>>> 
>>>>>> When using sec=krb5, krb5i, or krb5p I found that multi-
>>>>>> threaded
>>>>>> workloads
>>>>>> trigger a lot of server-side disconnects. This is with TCP
>>>>>> and
>>>>>> RDMA transports.
>>>>>> An instrumented server shows that the client is under-running 
>>>>>> the
>>>>>> GSS sequence
>>>>>> number window. I monitored the order in which GSS sequence
>>>>>> numbers appear on
>>>>>> the wire, and after this commit, the sequence numbers are
>>>>>> wildly
>>>>>> misordered.
>>>>>> If I revert the hunk in xprt_request_enqueue_transmit, the
>>>>>> problem goes away.
>>>>>> 
>>>>>> I also found that reverting that hunk results in a 3-4%
>>>>>> improvement in fio
>>>>>> IOPS rates, as well as improvement in average and maximum
>>>>>> latency
>>>>>> as reported
>>>>>> by fio.
>>>>>> 
>>>>> 
>>>>> Hmm… Provided the sequence numbers still lie within the window,
>>>>> then why would the order matter?
>>>> 
>>>> The misordering is so bad that one request is delayed long enough
>>>> to
>>>> fall outside the window. The new “need re-encode” logic does not
>>>> trigger.
>>>> 
>>> 
>>> That's weird. I can't see anything wrong with need re-encode at
>>> this
>>> point.
>> 
>> I don't think there is anything wrong with it, it looks like it's
>> not called in this case.
> 
> So you are saying that the call to rpcauth_xmit_need_reencode() is
> triggering the EBADMSG, but that this fails to cause a re-encode of the
> message?

No, I think what's going on is that the need_reencode happens when the
RPC is enqueued, and is successful.

But xprt_request_enqueue_transmit places the RPC somewhere in the middle
of xmit_queue. xmit_queue is long enough that more than 128 requests are
before the enqueued request.

>>> Do the window sizes agree on the client and the server?
>> 
>> Yes, both are 128. I also tried with 64 on the client side and 128
>> on the server side. That reduces the frequency of disconnects, but
>> does not eliminate them.
>> 
>> I'm not clear what problem the logic in xprt_request_enqueue_transmit
>> is trying to address. It seems to me that the initial, simple
>> implementation of this function is entirely adequate..?
> 
> I agree that the fair queueing code could result in a reordering that
> could screw up the RPCSEC_GSS sequencing. However, we do expect the
> need reencode stuff to catch that.

--
Chuck Lever