> On Dec 31, 2018, at 2:21 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> 
> On Mon, 2018-12-31 at 19:18 +0000, Trond Myklebust wrote:
>> On Mon, 2018-12-31 at 14:09 -0500, Chuck Lever wrote:
>>>> On Dec 31, 2018, at 1:59 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>>>> 
>>>> On Mon, 2018-12-31 at 13:44 -0500, Chuck Lever wrote:
>>>>>> On Dec 31, 2018, at 1:09 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>>>>>> 
>>>>>> On Thu, 2018-12-27 at 17:34 -0500, Chuck Lever wrote:
>>>>>>>> On Dec 27, 2018, at 5:14 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>>>>>>>> 
>>>>>>>>> On Dec 27, 2018, at 20:21, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>>>>>>>>> 
>>>>>>>>> Hi Trond-
>>>>>>>>> 
>>>>>>>>> I've chased down a couple of remaining regressions with the v4.20
>>>>>>>>> NFS client, and they seem to be rooted in this commit.
>>>>>>>>> 
>>>>>>>>> When using sec=krb5, krb5i, or krb5p I found that multi-threaded
>>>>>>>>> workloads trigger a lot of server-side disconnects. This is with
>>>>>>>>> TCP and RDMA transports. An instrumented server shows that the
>>>>>>>>> client is under-running the GSS sequence number window. I monitored
>>>>>>>>> the order in which GSS sequence numbers appear on the wire, and
>>>>>>>>> after this commit, the sequence numbers are wildly misordered.
>>>>>>>>> If I revert the hunk in xprt_request_enqueue_transmit, the problem
>>>>>>>>> goes away.
>>>>>>>>> 
>>>>>>>>> I also found that reverting that hunk results in a 3-4% improvement
>>>>>>>>> in fio IOPS rates, as well as improvement in average and maximum
>>>>>>>>> latency as reported by fio.
>>>>>>>> 
>>>>>>>> Hmm… Provided the sequence numbers still lie within the window,
>>>>>>>> then why would the order matter?
>>>>>>> 
>>>>>>> The misordering is so bad that one request is delayed long enough to
>>>>>>> fall outside the window. The new “need re-encode” logic does not
>>>>>>> trigger.
>>>>>> 
>>>>>> That's weird. I can't see anything wrong with need re-encode at this
>>>>>> point.
>>>>> 
>>>>> I don't think there is anything wrong with it, it looks like it's not
>>>>> called in this case.
>>>> 
>>>> So you are saying that the call to rpcauth_xmit_need_reencode() is
>>>> triggering the EBADMSG, but that this fails to cause a re-encode of
>>>> the message?
>>> 
>>> No, I think what's going on is that the need_reencode happens when the
>>> RPC is enqueued, and is successful.
>>> 
>>> But xprt_request_enqueue_transmit places the RPC somewhere in the middle
>>> of xmit_queue. xmit_queue is long enough that more than 128 requests are
>>> before the enqueued request.
>> 
>> The test for rpcauth_xmit_need_reencode() happens when we call
>> xprt_request_transmit() to actually put the RPC call on the wire. The
>> enqueue order should not be able to defeat that test.
>> 
>> Hmm... Is it perhaps the test for req->rq_bytes_sent that is failing
>> because this is a retransmission after a disconnect/reconnect that
>> didn't trigger a re-encode?
> 
> Actually, it might be worth a try to move the test for
> rpcauth_xmit_need_reencode() outside the enclosing test for
> req->rq_bytes_sent as that is just a minor optimisation.
Perhaps that's the case for TCP, but RPCs sent via xprtrdma never set
req->rq_bytes_sent to a non-zero value. The body of the "if" statement is
always executed for those RPCs.

--
Chuck Lever
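
[Editorial note: the following is a minimal toy model, not kernel code, of the
RPCSEC_GSS sequence-window behaviour (RFC 2203) that Chuck's report refers to.
Once enough later-encoded requests reach the wire ahead of an earlier one, the
earlier request's sequence number falls below the server's window and the
server drops it. The window size of 128 is taken from the figure quoted in the
thread; the request counts are illustrative.]

/*
 * Toy model of the server-side RPCSEC_GSS sequence-window check.
 * The server tracks the highest sequence number it has seen and
 * discards requests whose sequence number has fallen more than
 * GSS_SEQ_WINDOW below that maximum.
 */
#include <stdbool.h>
#include <stdio.h>

#define GSS_SEQ_WINDOW 128

static unsigned int seq_max;	/* highest sequence number seen so far */

/* Return true if the server would still accept this sequence number. */
static bool server_accepts(unsigned int seq)
{
	if (seq > seq_max)
		seq_max = seq;
	return seq + GSS_SEQ_WINDOW > seq_max;
}

int main(void)
{
	/*
	 * A request encoded with sequence number 1 sits in the transmit
	 * queue while 200 requests encoded later (seq 2..201) reach the
	 * wire first -- the misordering described above.
	 */
	for (unsigned int seq = 2; seq <= 201; seq++)
		server_accepts(seq);

	printf("late request with seq 1: %s\n",
	       server_accepts(1) ? "accepted" : "dropped (outside window)");
	return 0;
}

[Running this prints "dropped (outside window)": the late request's sequence
number is more than 128 below the highest one the server has seen, which is
the window under-run Chuck instrumented on the server.]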