Re: [PATCH] blk-mq: avoid extending delays of active hctx from blk_mq_delay_run_hw_queues

Ming Lei <ming.lei@xxxxxxxxxx> · Tue, 8 Feb 2022 10:45:18 +0800

On Tue, Feb 1, 2022 at 4:34 AM David Jeffery <djeffery@xxxxxxxxxx> wrote:
>
> When blk_mq_delay_run_hw_queues sets an hctx to run in the future, it can
> reset the delay length for an already pending delayed work run_work. This
> creates a scenario where multiple hctx may have their queues set to run,
> but if one runs first and finds nothing to do, it can reset the delay of
> another hctx and stall the other hctx's ability to run requests.
>
> To avoid this I/O stall when an hctx's run_work is already pending,
> leave it untouched to run at its current designated time rather than
> extending its delay. The work will still run which keeps closed the race
> calling blk_mq_delay_run_hw_queues is needed for while also avoiding the
> I/O stall.
>
> Signed-off-by: David Jeffery <djeffery@xxxxxxxxxx>
> ---
>  block/blk-mq.c |    8 ++++++++
>  1 file changed, 8 insertions(+)
>
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index f3bf3358a3bb..ae46eb4bf547 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2177,6 +2177,14 @@ void blk_mq_delay_run_hw_queues(struct request_queue *q, unsigned long msecs)
>         queue_for_each_hw_ctx(q, hctx, i) {
>                 if (blk_mq_hctx_stopped(hctx))
>                         continue;
> +               /*
> +                * If there is already a run_work pending, leave the
> +                * pending delay untouched. Otherwise, a hctx can stall
> +                * if another hctx is re-delaying the other's work
> +                * before the work executes.
> +                */
> +               if (delayed_work_pending(&hctx->run_work))
> +                       continue;

The issue is triggered with BFQ: its has_work() may return true while its
->dispatch_request() returns NULL, so blk_mq_delay_run_hw_queues() is
called to schedule a delayed run.
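
To make the stall concrete, here is a rough userspace toy model, not kernel
code. It assumes a delayed run request resets the expiry of an already
pending work item, as the patch description says. All names (toy_dwork,
toy_rearm, toy_arm_if_idle, toy_first_run) are hypothetical and only
illustrate the timing; time is counted in abstract ticks.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a delayed work item; NOT the kernel's struct delayed_work. */
struct toy_dwork {
	bool pending;
	unsigned long expires;	/* absolute expiry time, in ticks */
};

/* Unpatched behaviour (assumed): always reset the expiry to now + delay. */
void toy_rearm(struct toy_dwork *w, unsigned long now, unsigned long delay)
{
	w->pending = true;
	w->expires = now + delay;
}

/* Patched behaviour: leave an already pending item untouched. */
void toy_arm_if_idle(struct toy_dwork *w, unsigned long now, unsigned long delay)
{
	if (w->pending)
		return;
	toy_rearm(w, now, delay);
}

/*
 * One hctx's work is armed at t=0. Every 'period' ticks another hctx runs,
 * finds nothing to dispatch, and requests a delayed run of all hw queues.
 * Returns the tick at which the first hctx's work fires, or 'horizon' if
 * it never fires within the simulated window.
 */
unsigned long toy_first_run(bool skip_pending, unsigned long delay,
			    unsigned long period, unsigned long horizon)
{
	struct toy_dwork w = { false, 0 };

	assert(period > 0);
	toy_rearm(&w, 0, delay);

	for (unsigned long now = 1; now < horizon; now++) {
		if (w.pending && now >= w.expires)
			return now;		/* work finally runs */
		if (now % period == 0) {
			if (skip_pending)
				toy_arm_if_idle(&w, now, delay);
			else
				toy_rearm(&w, now, delay);
		}
	}
	return horizon;				/* starved for the whole window */
}
```

With delay=3 and period=2, toy_first_run(false, 3, 2, 100) never fires
inside the window, while toy_first_run(true, 3, 2, 100) fires at tick 3,
which is the difference the delayed_work_pending() check makes.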

With multiple hw queues, the described issue can be triggered and cause a
long I/O stall. There are only 3 in-tree callers of
blk_mq_delay_run_hw_queues(), and David's fix works well for all 3, so
this patch looks fine:

Reviewed-by: Ming Lei <ming.lei@xxxxxxxxxx>

Thanks,