[PATCH] blk-mq: run queue after issuing the last request of the plug list

Yufen Yu <yuyufen@xxxxxxxxxx> · Mon, 18 Jul 2022 20:35:28 +0800

We tested on a virtio-scsi device (/dev/sda) whose default mq
scheduler is 'none', and found the following I/O hang:

blk_finish_plug
  blk_mq_plug_issue_direct
      scsi_mq_get_budget
      //getting budget_token fails and sdev->restarts=1

                                 scsi_end_request
                                   scsi_run_queue_async
                                   //sdev->restarts=0 and run queue

     blk_mq_request_bypass_insert
        //add request to hctx->dispatch list

  //continue to dispatch plug list
  blk_mq_dispatch_plug_list
      blk_mq_try_issue_list_directly
        //successfully issue all requests from plug list

When .get_budget fails, scsi_mq_get_budget() increments 'restarts'.
Normally, the hw queue is run when an I/O completes, and 'restarts'
is reset to 0. But if that queue run happens before the request is
added to the dispatch list, and blk_mq_dispatch_plug_list() then
issues all remaining requests successfully, no one will run the
queue again. The request stalls on the dispatch list and never
completes.

To fix the bug, run the hw queue in blk_mq_sched_insert_requests()
after issuing the last request, even when all requests on the list
were issued directly.
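
For illustration only, the interleaving can be replayed with a small
userspace model. This is a sketch, not kernel code: struct toy_hctx,
toy_run_queue(), toy_end_request() and toy_replay() are hypothetical
names, the count of two directly issued requests is arbitrary, and a
queue run is assumed to simply drain the dispatch list.

```c
#include <stdbool.h>

/* Toy model of the state involved in the race (hypothetical names). */
struct toy_hctx {
	int restarts;      /* models sdev->restarts */
	int dispatch_len;  /* requests parked on hctx->dispatch */
	int completed;     /* requests that reached the device */
};

/* Models a hw queue run: assumed to drain the whole dispatch list. */
static void toy_run_queue(struct toy_hctx *h)
{
	h->completed += h->dispatch_len;
	h->dispatch_len = 0;
}

/* Models I/O completion: clear 'restarts' and run the queue if it was set. */
static void toy_end_request(struct toy_hctx *h)
{
	if (h->restarts) {
		h->restarts = 0;
		toy_run_queue(h);
	}
}

/*
 * Replays the interleaving from the trace and returns how many requests
 * remain stuck on the dispatch list. run_queue_at_end models running the
 * hw queue after the last request of the plug list has been issued.
 */
static int toy_replay(bool run_queue_at_end)
{
	struct toy_hctx h = { 0 };

	/* 1. Getting the budget fails, so 'restarts' is raised. */
	h.restarts = 1;

	/* 2. A completion on another CPU runs the queue; the list is still empty. */
	toy_end_request(&h);

	/* 3. Only now is the failed request added to the dispatch list. */
	h.dispatch_len++;

	/* 4. The rest of the plug list (arbitrarily 2 requests) is issued directly. */
	h.completed += 2;

	/* 5. Without a final queue run, nothing drains the dispatch list. */
	if (run_queue_at_end)
		toy_run_queue(&h);

	return h.dispatch_len;
}
```

In this model toy_replay(false) leaves one request on the dispatch
list, while toy_replay(true) leaves none.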

Signed-off-by: Yufen Yu <yuyufen@xxxxxxxxxx>
---
 block/blk-mq-sched.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index a4f7c101b53b..c3ad97ca2753 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -490,8 +490,8 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx,
 		blk_mq_insert_requests(hctx, ctx, list);
 	}
 
-	blk_mq_run_hw_queue(hctx, run_queue_async);
  out:
+	blk_mq_run_hw_queue(hctx, run_queue_async);
 	percpu_ref_put(&q->q_usage_counter);
 }
 
-- 
2.31.1