Re: [PATCH] blk-mq: avoid double ->queue_rq() because of early timeout

Jens Axboe <axboe@xxxxxxxxx> · Tue, 25 Oct 2022 18:53:23 -0600

On 10/25/22 6:21 PM, Ming Lei wrote:
> On Tue, Oct 25, 2022 at 12:11:39PM -0600, Jens Axboe wrote:
>> On 10/24/22 6:55 PM, Ming Lei wrote:
>>> @@ -1593,10 +1598,18 @@ static void blk_mq_timeout_work(struct work_struct *work)
>>>  	if (!percpu_ref_tryget(&q->q_usage_counter))
>>>  		return;
>>>  
>>> -	blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &next);
>>> +	/*
>>> +	 * Before walking tags, we must ensure any submit started before
>>> +	 * the current time has finished. Since the submit uses srcu or rcu,
>>> +	 * wait for a synchronization point to ensure all running submits
>>> +	 * have finished
>>> +	 */
>>> +	blk_mq_wait_quiesce_done(q);
>>
>> I'm a little worried about this bit - so we'll basically do a sync RCU
>> every time the timeout timer runs... Depending on machine load, that
>> can take a long time.
> 
> Yeah, the per-queue timeout timer is never canceled after request is
> completed, so most of times the timeout work does nothing.

Yep, it just keeps going, that's the point of the rolling timeout timer.

> Can we run the sync RCU only if there is timed out request found? Then
> the wait is only needed in case that timeout handling is required. Also
> sync rcu is already done in some driver's ->timeout().

That was going to be my suggestion, if it can get done only for when
there's actually a potential timeout candidate, then that would be
orders of magnitude better.

-- 
Jens Axboe
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization