On Wed, 2018-09-05 at 11:31 -0400, Chuck Lever wrote: > > On Sep 5, 2018, at 11:28 AM, Trond Myklebust < > > trondmy@xxxxxxxxxxxxxxx> wrote: > > > > On Wed, 2018-09-05 at 10:30 -0400, Chuck Lever wrote: > > > > On Sep 4, 2018, at 5:05 PM, Trond Myklebust <trondmy@xxxxxxxxx> > > > > wrote: > > > > > > > > Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx > > > > > > > > > --- > > > > net/sunrpc/clnt.c | 6 ------ > > > > net/sunrpc/xprt.c | 13 ++++++------- > > > > 2 files changed, 6 insertions(+), 13 deletions(-) > > > > > > > > diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c > > > > index 032b7042adb6..21713bed812a 100644 > > > > --- a/net/sunrpc/clnt.c > > > > +++ b/net/sunrpc/clnt.c > > > > @@ -1967,8 +1967,6 @@ call_connect_status(struct rpc_task > > > > *task) > > > > static void > > > > call_transmit(struct rpc_task *task) > > > > { > > > > - int is_retrans = RPC_WAS_SENT(task); > > > > - > > > > dprint_status(task); > > > > > > > > task->tk_action = call_transmit_status; > > > > @@ -1979,10 +1977,6 @@ call_transmit(struct rpc_task *task) > > > > if (!xprt_prepare_transmit(task)) > > > > return; > > > > xprt_transmit(task); > > > > - if (task->tk_status < 0) > > > > - return; > > > > - if (is_retrans) > > > > - task->tk_client->cl_stats->rpcretrans++; > > > > } > > > > > > > > /* > > > > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c > > > > index f1301d391399..e2f5b4668a66 100644 > > > > --- a/net/sunrpc/xprt.c > > > > +++ b/net/sunrpc/xprt.c > > > > @@ -191,8 +191,6 @@ int xprt_reserve_xprt(struct rpc_xprt > > > > *xprt, > > > > struct rpc_task *task) > > > > goto out_sleep; > > > > } > > > > xprt->snd_task = task; > > > > - if (req != NULL) > > > > - req->rq_ntrans++; > > > > > > > > return 1; > > > > > > > > @@ -247,7 +245,6 @@ int xprt_reserve_xprt_cong(struct rpc_xprt > > > > *xprt, struct rpc_task *task) > > > > } > > > > if (__xprt_get_cong(xprt, task)) { > > > > xprt->snd_task = task; > > > > - req->rq_ntrans++; > > > > return 1; > > > > } > > > > xprt_clear_locked(xprt); > > > > @@ -281,12 +278,8 @@ static inline int xprt_lock_write(struct > > > > rpc_xprt *xprt, struct rpc_task *task) > > > > static bool __xprt_lock_write_func(struct rpc_task *task, void > > > > *data) > > > > { > > > > struct rpc_xprt *xprt = data; > > > > - struct rpc_rqst *req; > > > > > > > > - req = task->tk_rqstp; > > > > xprt->snd_task = task; > > > > - if (req) > > > > - req->rq_ntrans++; > > > > return true; > > > > } > > > > > > > > @@ -1126,6 +1119,7 @@ void xprt_transmit(struct rpc_task *task) > > > > struct rpc_rqst *req = task->tk_rqstp; > > > > struct rpc_xprt *xprt = req->rq_xprt; > > > > unsigned int connect_cookie; > > > > + int is_retrans = RPC_WAS_SENT(task); > > > > int status; > > > > > > > > dprintk("RPC: %5u xprt_transmit(%u)\n", task->tk_pid, > > > > req- > > > > > rq_slen); > > > > > > > > @@ -1140,6 +1134,8 @@ void xprt_transmit(struct rpc_task *task) > > > > } > > > > } > > > > > > > > + req->rq_ntrans++; > > > > + > > > > > > rq_ntrans is used in two places: > > > > > > 1. rpc_update_rtt > > > 2. rpc_count_iostats_metrics > > > > > > Both of these appear to assume that rq_ntrans is counting > > > full successful transmissions (ie, calls that appear on the > > > wire), not invocations of ->send_request(). > > > > > > Can this counter be moved down to where rpcretrans is updated? > > > > > > > The reason why I don't want to update req->rq_ntrans after > > transmission > > is because that would race with the reply from the server (which is > > not > > blocked by the XPRT_LOCK). In that race scenario, rpc_update_rtt() > > could update the RTT value before we've updated the req->rq_ntrans, > > meaning that we might be measuring the response time against a > > reply to > > the original transmission or to the retransmission that was just > > sent. > > That would end up breaking Van Jacobsen congestion control, since > > we > > risk significantly underestimating the RTT value. > > But for transports that don't use xprt_update_rtt, error > returns from ->send_request() (like -EAGAIN or -ENOBUFS) > means that rq_ntrans is counted twice, and it shows up as > false retransmissions. That's the way it has always worked. The Van Jacobsen code was implemented long before the per-request stats were introduced, and so the behaviour of req->rq_ntrans has always been to update the value before we transmit. > What if rq_ntrans was bumped in the transport code instead? If we must change the behaviour, then why not just have the generic layer 'unbump' the counter on an incomplete transmission? -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx