On Thu, 2018-12-27 at 17:34 -0500, Chuck Lever wrote:
> > On Dec 27, 2018, at 5:14 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> >
> > > On Dec 27, 2018, at 20:21, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> > >
> > > Hi Trond-
> > >
> > > I've chased down a couple of remaining regressions with the v4.20
> > > NFS client, and they seem to be rooted in this commit.
> > >
> > > When using sec=krb5, krb5i, or krb5p I found that multi-threaded
> > > workloads trigger a lot of server-side disconnects. This is with
> > > TCP and RDMA transports. An instrumented server shows that the
> > > client is under-running the GSS sequence number window. I monitored
> > > the order in which GSS sequence numbers appear on the wire, and
> > > after this commit, the sequence numbers are wildly misordered.
> > > If I revert the hunk in xprt_request_enqueue_transmit, the problem
> > > goes away.
> > >
> > > I also found that reverting that hunk results in a 3-4% improvement
> > > in fio IOPS rates, as well as improvement in average and maximum
> > > latency as reported by fio.
> > >
> >
> > Hmm… Provided the sequence numbers still lie within the window,
> > then why would the order matter?
>
> The misordering is so bad that one request is delayed long enough to
> fall outside the window. The new “need re-encode” logic does not
> trigger.
>

That's weird. I can't see anything wrong with need re-encode at this
point.

Do the window sizes agree on the client and the server?

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx
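
For readers following the thread, here is a minimal sketch of why mild reordering is harmless while a long delay is not. It is a toy model of a server-side sliding sequence window, loosely following the scheme RPCSEC_GSS describes; it is not the kernel's code. The class name SeqWindow, the accept() method, and the window size of 4 are all assumptions made for illustration.

```python
class SeqWindow:
    """Toy model of a server-side GSS sequence window (hypothetical, not kernel code)."""

    def __init__(self, size):
        self.size = size      # window size (illustrative value chosen by the caller)
        self.max_seen = 0     # highest sequence number accepted so far
        self.seen = set()     # sequence numbers accepted inside the current window

    def accept(self, seq):
        """Return True if a request with this sequence number would be accepted."""
        if seq > self.max_seen:
            # A new highest number slides the window forward.
            self.max_seen = seq
            low = self.max_seen - self.size
            self.seen = {s for s in self.seen if s > low}
            self.seen.add(seq)
            return True
        if seq <= self.max_seen - self.size:
            # The request was delayed so long that it fell below the window.
            return False
        if seq in self.seen:
            # Same number seen twice inside the window: treated as a replay.
            return False
        self.seen.add(seq)
        return True


if __name__ == "__main__":
    mild = SeqWindow(4)
    print([mild.accept(s) for s in [1, 3, 2, 4]])

    delayed = SeqWindow(4)
    print([delayed.accept(s) for s in [2, 3, 4, 5, 6, 1]])
```

In the first run every request stays inside the window, so the order does not matter. In the second run, request 1 arrives after the window has advanced to 6 and is rejected, which is the under-run case described in the quoted text above.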