Re: [PATCH 1/2] block/mq-deadline: serialize request dispatching

Ming Lei <ming.lei@xxxxxxxxxx> · Fri, 19 Jan 2024 10:40:47 +0800

On Thu, Jan 18, 2024 at 11:04:56AM -0700, Jens Axboe wrote:
> If we're entering request dispatch but someone else is already
> dispatching, then just skip this dispatch. We know IO is inflight and
> this will trigger another dispatch event for any completion. This will
> potentially cause slightly lower queue depth for contended cases, but
> those are slowed down anyway and this should not cause an issue.
> 
> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
> ---
>  block/mq-deadline.c | 28 +++++++++++++++++++++++-----
>  1 file changed, 23 insertions(+), 5 deletions(-)
> 
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index f958e79277b8..9e0ab3ea728a 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -79,10 +79,20 @@ struct dd_per_prio {
>  	struct io_stats_per_prio stats;
>  };
>  
> +enum {
> +	DD_DISPATCHING	= 0,
> +};
> +
>  struct deadline_data {
>  	/*
>  	 * run time data
>  	 */
> +	struct {
> +		spinlock_t lock;
> +		spinlock_t zone_lock;
> +	} ____cacheline_aligned_in_smp;
> +
> +	unsigned long run_state;
>  
>  	struct dd_per_prio per_prio[DD_PRIO_COUNT];
>  
> @@ -100,9 +110,6 @@ struct deadline_data {
>  	int front_merges;
>  	u32 async_depth;
>  	int prio_aging_expire;
> -
> -	spinlock_t lock;
> -	spinlock_t zone_lock;
>  };
>  
>  /* Maps an I/O priority class to a deadline scheduler priority. */
> @@ -600,6 +607,15 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
>  	struct request *rq;
>  	enum dd_prio prio;
>  
> +	/*
> +	 * If someone else is already dispatching, skip this one. This will
> +	 * defer the next dispatch event to when something completes, and could
> +	 * potentially lower the queue depth for contended cases.
> +	 */
> +	if (test_bit(DD_DISPATCHING, &dd->run_state) ||
> +	    test_and_set_bit(DD_DISPATCHING, &dd->run_state))
> +		return NULL;
> +

This patch looks fine.

BTW, the current dispatch is actually piggyback in the in-progress dispatch,
see blk_mq_do_dispatch_sched(). And the correctness should depend on
the looping dispatch & retry for nothing to dispatch in
blk_mq_do_dispatch_sched(), maybe we need to document it here.

Thanks,
Ming