On Tue, Oct 25, 2022 at 12:11:39PM -0600, Jens Axboe wrote: > On 10/24/22 6:55 PM, Ming Lei wrote: > > @@ -1593,10 +1598,18 @@ static void blk_mq_timeout_work(struct work_struct *work) > > if (!percpu_ref_tryget(&q->q_usage_counter)) > > return; > > > > - blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &next); > > + /* > > + * Before walking tags, we must ensure any submit started before > > + * the current time has finished. Since the submit uses srcu or rcu, > > + * wait for a synchronization point to ensure all running submits > > + * have finished > > + */ > > + blk_mq_wait_quiesce_done(q); > > I'm a little worried about this bit - so we'll basically do a sync RCU > every time the timeout timer runs... Depending on machine load, that > can take a long time. Yeah, the per-queue timeout timer is never canceled after request is completed, so most of times the timeout work does nothing. Can we run the sync RCU only if there is timed out request found? Then the wait is only needed in case that timeout handling is required. Also sync rcu is already done in some driver's ->timeout(). Thanks, Ming