On Tue, Jul 26, 2022 at 07:01:11PM +0800, Yufen Yu wrote:
> We ran a test on a virtio-scsi device (/dev/sda) with the default mq
> scheduler 'none' and found an IO hang, as follows:
>
> blk_finish_plug
>   blk_mq_plug_issue_direct
>     scsi_mq_get_budget
>     //get budget_token fails and sdev->restarts=1
>
>                                scsi_end_request
>                                  scsi_run_queue_async
>                                  //sdev->restarts=0 and run queue
>
>     blk_mq_request_bypass_insert
>     //add request to hctx->dispatch list
>
>   //continue to dispatch plug list
>   blk_mq_dispatch_plug_list
>     blk_mq_try_issue_list_directly
>     //successfully issue all requests from plug list
>
> After .get_budget fails, scsi_mq_get_budget increases 'restarts'.
> Normally, the hw queue is run when an IO completes and 'restarts' is
> reset to 0. But if the queue is run before the request is added to the
> dispatch list, and blk_mq_dispatch_plug_list then successfully issues
> all the remaining requests from the plug list, no one will run the
> queue again, so the request stalls on the dispatch list and never
> completes.

The story isn't actually related to scsi.

> It is wrong to use the last request of the plug list to decide whether
> the queue needs to be run, since all the remaining requests in the plug
> list may belong to other hctxs. To fix the bug, always pass run_queue
> as true to blk_mq_request_bypass_insert().
>
> Fix-suggested-by: Ming Lei <ming.lei@xxxxxxxxxx>
> Signed-off-by: Yufen Yu <yuyufen@xxxxxxxxxx>

Reviewed-by: Ming Lei <ming.lei@xxxxxxxxxx>

Thanks,
Ming
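
For context: blk_mq_request_bypass_insert(rq, at_head, run_queue) takes
run_queue as its third argument, and the call site described above sits
in the BLK_STS_RESOURCE / BLK_STS_DEV_RESOURCE branch of
blk_mq_plug_issue_direct(), where run_queue was derived from whether the
request was the last one on the plug list. A sketch of the described
change against the blk-mq code of that period (the hunk context below is
reconstructed and may differ from the patch actually submitted):

--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ ... @@ static void blk_mq_plug_issue_direct(struct blk_plug *plug, bool from_schedule)
 		case BLK_STS_RESOURCE:
 		case BLK_STS_DEV_RESOURCE:
-			blk_mq_request_bypass_insert(rq, false, last);
+			/*
+			 * Always run the queue after parking the request on
+			 * hctx->dispatch: the remaining plug list requests
+			 * may belong to other hctxs, so being "not the last
+			 * request" does not guarantee this hctx will ever be
+			 * run again.
+			 */
+			blk_mq_request_bypass_insert(rq, false, true);
 			blk_mq_commit_rqs(hctx, &queued, from_schedule);
 			return;

With run_queue forced to true, the failed request is re-dispatched even
when scsi_run_queue_async() has already consumed the 'restarts' rerun
before the request was added to hctx->dispatch, and even when the rest
of the plug list is issued successfully on other hctxs.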