On 10/25/22 6:21 PM, Ming Lei wrote: > On Tue, Oct 25, 2022 at 12:11:39PM -0600, Jens Axboe wrote: >> On 10/24/22 6:55 PM, Ming Lei wrote: >>> @@ -1593,10 +1598,18 @@ static void blk_mq_timeout_work(struct work_struct *work) >>> if (!percpu_ref_tryget(&q->q_usage_counter)) >>> return; >>> >>> - blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &next); >>> + /* >>> + * Before walking tags, we must ensure any submit started before >>> + * the current time has finished. Since the submit uses srcu or rcu, >>> + * wait for a synchronization point to ensure all running submits >>> + * have finished >>> + */ >>> + blk_mq_wait_quiesce_done(q); >> >> I'm a little worried about this bit - so we'll basically do a sync RCU >> every time the timeout timer runs... Depending on machine load, that >> can take a long time. > > Yeah, the per-queue timeout timer is never canceled after request is > completed, so most of times the timeout work does nothing. Yep, it just keeps going, that's the point of the rolling timeout timer. > Can we run the sync RCU only if there is timed out request found? Then > the wait is only needed in case that timeout handling is required. Also > sync rcu is already done in some driver's ->timeout(). That was going to be my suggestion, if it can get done only for when there's actually a potential timeout candidate, then that would be orders of magnitude better. -- Jens Axboe _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization