Re: [PATCH 1/4] block: fix request starvation when queue is stopped or quiesced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Aug 11, 2024 at 06:19:18PM +0800, Muchun Song wrote:
> Supposing the following scenario with a virtio_blk driver.
> 
> CPU0                                    CPU1                                    CPU2
> 
> blk_mq_try_issue_directly()
>     __blk_mq_issue_directly()
>         q->mq_ops->queue_rq()
>             virtio_queue_rq()
>                 blk_mq_stop_hw_queue()
>                                         blk_mq_try_issue_directly()             virtblk_done()
>                                             if (blk_mq_hctx_stopped())
>     blk_mq_request_bypass_insert()                                                  blk_mq_start_stopped_hw_queue()
>     blk_mq_run_hw_queue()                                                               blk_mq_run_hw_queue()
>                                                 blk_mq_insert_request()
>                                                 return // Who is responsible for dispatching this IO request?
> 
> After CPU0 has marked the queue as stopped, CPU1 will see the queue is stopped.
> But before CPU1 puts the request on the dispatch list, CPU2 receives the interrupt
> of completion of request, so it will run the hardware queue and marks the queue
> as non-stopped. Meanwhile, CPU1 also runs the same hardware queue. After both CPU1
> and CPU2 complete blk_mq_run_hw_queue(), CPU1 just puts the request to the same
> hardware queue and returns. Seems it misses dispatching a request. Fix it by
> running the hardware queue explicitly. I think blk_mq_request_issue_directly()
> should handle a similar problem.
> 
> Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
> ---
>  block/blk-mq.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index e3c3c0c21b553..b2d0f22de0c7f 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2619,6 +2619,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>  
>  	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(rq->q)) {
>  		blk_mq_insert_request(rq, 0);
> +		blk_mq_run_hw_queue(hctx, false);
>  		return;
>  	}
>  
> @@ -2649,6 +2650,7 @@ static blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
>  
>  	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(rq->q)) {
>  		blk_mq_insert_request(rq, 0);
> +		blk_mq_run_hw_queue(hctx, false);
>  		return BLK_STS_OK;
>  	}

Looks one real issue, and the fix is fine:

Reviewed-by: Ming Lei <ming.lei@xxxxxxxxxx>


Thanks, 
Ming





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux