Re: [PATCH v2 23/34] SUNRPC: Move RPC retransmission stat counter to xprt_transmit()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Sep 5, 2018, at 12:07 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> 
> On Wed, 2018-09-05 at 11:31 -0400, Chuck Lever wrote:
>>> On Sep 5, 2018, at 11:28 AM, Trond Myklebust <
>>> trondmy@xxxxxxxxxxxxxxx> wrote:
>>> 
>>> On Wed, 2018-09-05 at 10:30 -0400, Chuck Lever wrote:
>>>>> On Sep 4, 2018, at 5:05 PM, Trond Myklebust <trondmy@xxxxxxxxx>
>>>>> wrote:
>>>>> 
>>>>> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx
>>>>>> 
>>>>> ---
>>>>> net/sunrpc/clnt.c |  6 ------
>>>>> net/sunrpc/xprt.c | 13 ++++++-------
>>>>> 2 files changed, 6 insertions(+), 13 deletions(-)
>>>>> 
>>>>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
>>>>> index 032b7042adb6..21713bed812a 100644
>>>>> --- a/net/sunrpc/clnt.c
>>>>> +++ b/net/sunrpc/clnt.c
>>>>> @@ -1967,8 +1967,6 @@ call_connect_status(struct rpc_task
>>>>> *task)
>>>>> static void
>>>>> call_transmit(struct rpc_task *task)
>>>>> {
>>>>> -	int is_retrans = RPC_WAS_SENT(task);
>>>>> -
>>>>> 	dprint_status(task);
>>>>> 
>>>>> 	task->tk_action = call_transmit_status;
>>>>> @@ -1979,10 +1977,6 @@ call_transmit(struct rpc_task *task)
>>>>> 	if (!xprt_prepare_transmit(task))
>>>>> 		return;
>>>>> 	xprt_transmit(task);
>>>>> -	if (task->tk_status < 0)
>>>>> -		return;
>>>>> -	if (is_retrans)
>>>>> -		task->tk_client->cl_stats->rpcretrans++;
>>>>> }
>>>>> 
>>>>> /*
>>>>> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
>>>>> index f1301d391399..e2f5b4668a66 100644
>>>>> --- a/net/sunrpc/xprt.c
>>>>> +++ b/net/sunrpc/xprt.c
>>>>> @@ -191,8 +191,6 @@ int xprt_reserve_xprt(struct rpc_xprt
>>>>> *xprt,
>>>>> struct rpc_task *task)
>>>>> 		goto out_sleep;
>>>>> 	}
>>>>> 	xprt->snd_task = task;
>>>>> -	if (req != NULL)
>>>>> -		req->rq_ntrans++;
>>>>> 
>>>>> 	return 1;
>>>>> 
>>>>> @@ -247,7 +245,6 @@ int xprt_reserve_xprt_cong(struct rpc_xprt
>>>>> *xprt, struct rpc_task *task)
>>>>> 	}
>>>>> 	if (__xprt_get_cong(xprt, task)) {
>>>>> 		xprt->snd_task = task;
>>>>> -		req->rq_ntrans++;
>>>>> 		return 1;
>>>>> 	}
>>>>> 	xprt_clear_locked(xprt);
>>>>> @@ -281,12 +278,8 @@ static inline int xprt_lock_write(struct
>>>>> rpc_xprt *xprt, struct rpc_task *task)
>>>>> static bool __xprt_lock_write_func(struct rpc_task *task, void
>>>>> *data)
>>>>> {
>>>>> 	struct rpc_xprt *xprt = data;
>>>>> -	struct rpc_rqst *req;
>>>>> 
>>>>> -	req = task->tk_rqstp;
>>>>> 	xprt->snd_task = task;
>>>>> -	if (req)
>>>>> -		req->rq_ntrans++;
>>>>> 	return true;
>>>>> }
>>>>> 
>>>>> @@ -1126,6 +1119,7 @@ void xprt_transmit(struct rpc_task *task)
>>>>> 	struct rpc_rqst	*req = task->tk_rqstp;
>>>>> 	struct rpc_xprt	*xprt = req->rq_xprt;
>>>>> 	unsigned int connect_cookie;
>>>>> +	int is_retrans = RPC_WAS_SENT(task);
>>>>> 	int status;
>>>>> 
>>>>> 	dprintk("RPC: %5u xprt_transmit(%u)\n", task->tk_pid,
>>>>> req-
>>>>>> rq_slen);
>>>>> 
>>>>> @@ -1140,6 +1134,8 @@ void xprt_transmit(struct rpc_task *task)
>>>>> 		}
>>>>> 	}
>>>>> 
>>>>> +	req->rq_ntrans++;
>>>>> +
>>>> 
>>>> rq_ntrans is used in two places:
>>>> 
>>>> 1. rpc_update_rtt
>>>> 2. rpc_count_iostats_metrics
>>>> 
>>>> Both of these appear to assume that rq_ntrans is counting
>>>> full successful transmissions (ie, calls that appear on the
>>>> wire), not invocations of ->send_request().
>>>> 
>>>> Can this counter be moved down to where rpcretrans is updated?
>>>> 
>>> 
>>> The reason why I don't want to update req->rq_ntrans after
>>> transmission
>>> is because that would race with the reply from the server (which is
>>> not
>>> blocked by the XPRT_LOCK). In that race scenario, rpc_update_rtt()
>>> could update the RTT value before we've updated the req->rq_ntrans,
>>> meaning that we might be measuring the response time against a
>>> reply to
>>> the original transmission or to the retransmission that was just
>>> sent.
>>> That would end up breaking Van Jacobsen congestion control, since
>>> we
>>> risk significantly underestimating the RTT value.
>> 
>> But for transports that don't use xprt_update_rtt, error
>> returns from ->send_request() (like -EAGAIN or -ENOBUFS)
>> means that rq_ntrans is counted twice, and it shows up as
>> false retransmissions.
> 
> That's the way it has always worked.

Not quite: the current xprt_reserve_xprt_cong has a latch -- rq_cong.
req->rq_ntrans is bumped only once when it gets a congestion credit.

The transport send lock is held until enough ->send_request calls
have been done to transmit the whole Call message, but rq_ntrans
is bumped only once per full RPC Call message.

Note that this code is a little dodgy for xprt_reserve_xprt: when
driving a connect, the RPC task takes and releases the transport
send lock twice, and both times it bumps rq_ntrans, which is a bug.
This patch appears to fix that.


> The Van Jacobsen code was
> implemented long before the per-request stats were introduced, and so
> the behaviour of req->rq_ntrans has always been to update the value
> before we transmit.

>> What if rq_ntrans was bumped in the transport code instead?
> 
> If we must change the behaviour, then why not just have the generic
> layer 'unbump' the counter on an incomplete transmission?

OK. There's no way for a reply to arrive if the call hasn't been
fully transmitted, so a race is avoided in that case.

--
Chuck Lever







[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux