Re: [PATCH v3 26/44] SUNRPC: Improve latency for interactive tasks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Dec 31, 2018, at 1:09 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> 
> On Thu, 2018-12-27 at 17:34 -0500, Chuck Lever wrote:
>>> On Dec 27, 2018, at 5:14 PM, Trond Myklebust <
>>> trondmy@xxxxxxxxxxxxxxx> wrote:
>>> 
>>> 
>>> 
>>>> On Dec 27, 2018, at 20:21, Chuck Lever <chuck.lever@xxxxxxxxxx>
>>>> wrote:
>>>> 
>>>> Hi Trond-
>>>> 
>>>> I've chased down a couple of remaining regressions with the v4.20
>>>> NFS client,
>>>> and they seem to be rooted in this commit.
>>>> 
>>>> When using sec=krb5, krb5i, or krb5p I found that multi-threaded
>>>> workloads
>>>> trigger a lot of server-side disconnects. This is with TCP and
>>>> RDMA transports.
>>>> An instrumented server shows that the client is under-running the
>>>> GSS sequence
>>>> number window. I monitored the order in which GSS sequence
>>>> numbers appear on
>>>> the wire, and after this commit, the sequence numbers are wildly
>>>> misordered.
>>>> If I revert the hunk in xprt_request_enqueue_transmit, the
>>>> problem goes away.
>>>> 
>>>> I also found that reverting that hunk results in a 3-4%
>>>> improvement in fio
>>>> IOPS rates, as well as improvement in average and maximum latency
>>>> as reported
>>>> by fio.
>>>> 
>>> 
>>> Hmm… Provided the sequence numbers still lie within the window,
>>> then why would the order matter?
>> 
>> The misordering is so bad that one request is delayed long enough to
>> fall outside the window. The new “need re-encode” logic does not
>> trigger.
>> 
> 
> That's weird. I can't see anything wrong with need re-encode at this
> point.

I don't think there is anything wrong with it, it looks like it's
not called in this case.


> Do the window sizes agree on the client and the server?

Yes, both are 128. I also tried with 64 on the client side and 128
on the server side. That reduces the frequency of disconnects, but
does not eliminate them.

I'm not clear what problem the logic in xprt_request_enqueue_transmit
is trying to address. It seems to me that the initial, simple
implementation of this function is entirely adequate..?


--
Chuck Lever







[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux