Re: [PATCH v4 1/2] SUNRPC: Fixup v4.1 backchannel request timeouts

On 4 Jan 2024, at 10:09, Chuck Lever wrote:

> On Thu, Jan 04, 2024 at 09:58:45AM -0500, Benjamin Coddington wrote:
>> After commit 59464b262ff5 ("SUNRPC: SOFTCONN tasks should time out when on
>> the sending list"), any 4.1 backchannel tasks placed on the sending queue
>                       ^^^
>
> "any" ? I found that this problem occurs only when the transport
> write lock is held (ie, when the forechannel is sending a Call).
> If the transport is idle, things work as expected. But OK, maybe
> your reproducer is different than mine.

Any that are _placed on the sending queue_.
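
To illustrate why a queued backchannel request with zeroed timers fails at
once, here is a small userspace sketch. It is not the kernel code: every
name prefixed model_ is hypothetical, and it assumes the wait deadline for
the sending queue is computed as the current tick count plus the request's
rq_timeout.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant timer field of struct rpc_rqst. */
struct model_rqst {
	unsigned long rq_timeout;	/* per-try timeout, in ticks */
};

/* Wraparound-safe "a is after b", in the style of the kernel's time_after(). */
static bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Assumed deadline for a task waiting on the sending queue. */
static unsigned long model_queue_deadline(const struct model_rqst *req,
					  unsigned long now)
{
	return now + req->rq_timeout;
}

/* True if the wait would time out before any time has passed. */
static bool model_expires_immediately(const struct model_rqst *req,
				      unsigned long now)
{
	return !model_time_after(model_queue_deadline(req, now), now);
}
```

With rq_timeout left at zero the deadline equals "now", so the wait is
already over when it starts. Once rq_timeout is set from a real timeout
table the deadline lies in the future and the task can actually wait.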

> One more comment below.
>
>
>> would immediately return with -ETIMEDOUT since their req timers are zero.
>>
>> Initialize the backchannel's rpc_rqst timeout parameters from the xprt's
>> default timeout settings.
>>
>> Fixes: 59464b262ff5 ("SUNRPC: SOFTCONN tasks should time out when on the sending list")
>> Signed-off-by: Benjamin Coddington <bcodding@xxxxxxxxxx>
>> ---
>>  net/sunrpc/xprt.c | 23 ++++++++++++++---------
>>  1 file changed, 14 insertions(+), 9 deletions(-)
>>
>> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
>> index 2364c485540c..6cc9ffac962d 100644
>> --- a/net/sunrpc/xprt.c
>> +++ b/net/sunrpc/xprt.c
>> @@ -651,9 +651,9 @@ static unsigned long xprt_abs_ktime_to_jiffies(ktime_t abstime)
>>  		jiffies + nsecs_to_jiffies(-delta);
>>  }
>>
>> -static unsigned long xprt_calc_majortimeo(struct rpc_rqst *req)
>> +static unsigned long xprt_calc_majortimeo(struct rpc_rqst *req,
>> +		const struct rpc_timeout *to)
>>  {
>> -	const struct rpc_timeout *to = req->rq_task->tk_client->cl_timeout;
>>  	unsigned long majortimeo = req->rq_timeout;
>>
>>  	if (to->to_exponential)
>> @@ -665,9 +665,10 @@ static unsigned long xprt_calc_majortimeo(struct rpc_rqst *req)
>>  	return majortimeo;
>>  }
>>
>> -static void xprt_reset_majortimeo(struct rpc_rqst *req)
>> +static void xprt_reset_majortimeo(struct rpc_rqst *req,
>> +		const struct rpc_timeout *to)
>>  {
>> -	req->rq_majortimeo += xprt_calc_majortimeo(req);
>> +	req->rq_majortimeo += xprt_calc_majortimeo(req, to);
>>  }
>>
>>  static void xprt_reset_minortimeo(struct rpc_rqst *req)
>> @@ -675,7 +676,8 @@ static void xprt_reset_minortimeo(struct rpc_rqst *req)
>>  	req->rq_minortimeo += req->rq_timeout;
>>  }
>>
>> -static void xprt_init_majortimeo(struct rpc_task *task, struct rpc_rqst *req)
>> +static void xprt_init_majortimeo(struct rpc_task *task, struct rpc_rqst *req,
>> +		const struct rpc_timeout *to)
>>  {
>>  	unsigned long time_init;
>>  	struct rpc_xprt *xprt = req->rq_xprt;
>> @@ -684,8 +686,9 @@ static void xprt_init_majortimeo(struct rpc_task *task, struct rpc_rqst *req)
>>  		time_init = jiffies;
>>  	else
>>  		time_init = xprt_abs_ktime_to_jiffies(task->tk_start);
>> -	req->rq_timeout = task->tk_client->cl_timeout->to_initval;
>> -	req->rq_majortimeo = time_init + xprt_calc_majortimeo(req);
>> +
>> +	req->rq_timeout = to->to_initval;
>> +	req->rq_majortimeo = time_init + xprt_calc_majortimeo(req, to);
>>  	req->rq_minortimeo = time_init + req->rq_timeout;
>>  }
>>
>> @@ -713,7 +716,7 @@ int xprt_adjust_timeout(struct rpc_rqst *req)
>>  	} else {
>>  		req->rq_timeout = to->to_initval;
>>  		req->rq_retries = 0;
>> -		xprt_reset_majortimeo(req);
>> +		xprt_reset_majortimeo(req, to);
>>  		/* Reset the RTT counters == "slow start" */
>>  		spin_lock(&xprt->transport_lock);
>>  		rpc_init_rtt(req->rq_task->tk_client->cl_rtt, to->to_initval);
>> @@ -1886,7 +1889,7 @@ xprt_request_init(struct rpc_task *task)
>>  	req->rq_snd_buf.bvec = NULL;
>>  	req->rq_rcv_buf.bvec = NULL;
>>  	req->rq_release_snd_buf = NULL;
>> -	xprt_init_majortimeo(task, req);
>> +	xprt_init_majortimeo(task, req, task->tk_client->cl_timeout);
>>
>>  	trace_xprt_reserve(req);
>>  }
>> @@ -1996,6 +1999,8 @@ xprt_init_bc_request(struct rpc_rqst *req, struct rpc_task *task)
>>  	 */
>>  	xbufp->len = xbufp->head[0].iov_len + xbufp->page_len +
>>  		xbufp->tail[0].iov_len;
>> +
>
> +	/*
> +	 * Backchannel Replies are sent with !RPC_TASK_SOFT and
> +	 * RPC_TASK_NO_RETRANS_TIMEOUT. The major timeout setting
> +	 * affects only how long each Reply waits to be sent when
> +	 * a transport connection cannot be established.
> +	 */

I put this on 2/2 like I said in my earlier response.  I've been trying not
to make a delta on 1/2 (yes, even though it's just a comment) because there's
a nonzero chance a maintainer is currently testing it to fix 6.7.  I
probably should not have made these two into a series, except that the 2nd
depends on the 1st.

If you definitely want it here instead, I will send a v5.  I think we're
probably going to be stuck with a broken 6.7 at this point.

Ben