Re: [PATCH v2 4/8] blk-mq: fix blk_mq_quiesce_queue

Bart Van Assche <Bart.VanAssche@xxxxxxxxxxx> · Sat, 27 May 2017 21:46:45 +0000

On Sat, 2017-05-27 at 22:21 +0800, Ming Lei wrote:
> It is required that no dispatch can happen any more once
> blk_mq_quiesce_queue() returns, and we don't have such requirement
> on APIs of stopping queue.
> 
> But blk_mq_quiesce_queue() still may not block/drain dispatch in the
> following cases:
> 
> - direct issue or BLK_MQ_S_START_ON_RUN
> - in theory, new RCU read-side critical sections may begin while
> synchronize_rcu() was waiting, and end after synchronize_rcu()
> returns, during the period dispatch still may happen

Hello Ming,

I think the title and the description of this patch are wrong. Since
the current queue quiescing mechanism works fine for drivers that do
not stop and restart a queue (e.g. SCSI and dm-core), please change the
title and description to reflect that the purpose of this patch is
to allow drivers that use the quiesce mechanism to restart a queue
without unquiescing it.

> @@ -209,6 +217,9 @@ void blk_mq_wake_waiters(struct request_queue *q)
>  	 * the queue are notified as well.
>  	 */
>  	wake_up_all(&q->mq_freeze_wq);
> +
> +	/* Forcibly unquiesce the queue to avoid having stuck requests */
> +	blk_mq_unquiesce_queue(q);
>  }

Should the block layer unquiesce a queue if a block driver hasn't
done that before queue removal starts, or should the block driver
itself do that? The block layer doesn't restart stopped queues from
inside blk_set_queue_dying() so why should it unquiesce a quiesced
queue?

>  bool blk_mq_can_queue(struct blk_mq_hw_ctx *hctx)
> @@ -1108,13 +1119,15 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
>  
>  	if (!(hctx->flags & BLK_MQ_F_BLOCKING)) {
>  		rcu_read_lock();
> -		blk_mq_sched_dispatch_requests(hctx);
> +		if (!blk_queue_quiesced(hctx->queue))
> +			blk_mq_sched_dispatch_requests(hctx);
>  		rcu_read_unlock();
>  	} else {
>  		might_sleep();
>  
>  		srcu_idx = srcu_read_lock(&hctx->queue_rq_srcu);
> -		blk_mq_sched_dispatch_requests(hctx);
> +		if (!blk_queue_quiesced(hctx->queue))
> +			blk_mq_sched_dispatch_requests(hctx);
>  		srcu_read_unlock(&hctx->queue_rq_srcu, srcu_idx);
>  	}
>  }

Sorry, but I don't like these changes. Why have the blk_queue_quiesced()
calls been added at other code locations than the blk_mq_hctx_stopped() calls?
This will make the block layer unnecessarily hard to maintain. Please consider
changing the blk_mq_hctx_stopped(hctx) calls in blk_mq_sched_dispatch_requests()
and *blk_mq_*run_hw_queue*() into blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q).
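
To illustrate the idea, here is a minimal userspace sketch, not kernel code.
The two structs are stripped-down stand-ins for the real ones, the "stopped"
and "quiesced" booleans replace the real state bits, and the helper
hctx_dispatch_blocked() and demo_dispatch_count() are hypothetical names that
do not exist in the tree. It only shows that a single combined check at the
existing blk_mq_hctx_stopped() location covers both the stopped and the
quiesced case:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel structures; only the fields used here. */
struct request_queue {
	bool quiesced;		/* assumption: models the quiesced queue flag */
};

struct blk_mq_hw_ctx {
	struct request_queue *queue;
	bool stopped;		/* assumption: models the stopped state bit */
	int dispatched;		/* counts dispatch passes, for the demo only */
};

static bool blk_mq_hctx_stopped(const struct blk_mq_hw_ctx *hctx)
{
	return hctx->stopped;
}

static bool blk_queue_quiesced(const struct request_queue *q)
{
	return q->quiesced;
}

/* Hypothetical helper: one place that decides whether dispatch may run. */
static bool hctx_dispatch_blocked(const struct blk_mq_hw_ctx *hctx)
{
	return blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(hctx->queue);
}

/* Mock of the dispatch entry point with the combined check. */
static void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
{
	if (hctx_dispatch_blocked(hctx))
		return;
	hctx->dispatched++;
}

/* Hypothetical demo: returns how many dispatch passes actually ran. */
static int demo_dispatch_count(void)
{
	struct request_queue q = { .quiesced = false };
	struct blk_mq_hw_ctx hctx = { .queue = &q, .stopped = false };

	blk_mq_sched_dispatch_requests(&hctx);	/* runs */

	q.quiesced = true;
	blk_mq_sched_dispatch_requests(&hctx);	/* skipped: quiesced */

	q.quiesced = false;
	hctx.stopped = true;
	blk_mq_sched_dispatch_requests(&hctx);	/* skipped: stopped */

	hctx.stopped = false;
	blk_mq_sched_dispatch_requests(&hctx);	/* runs again */

	return hctx.dispatched;
}
```

With this shape, callers such as __blk_mq_run_hw_queue() would not need a
separate blk_queue_quiesced() test around the dispatch call.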

Thanks,

Bart.