On 6/28/22 22:18, Liu Song wrote:
> From: Liu Song <liusong@xxxxxxxxxxxxxxxxx>
>
> In "__blk_mq_delay_run_hw_queue", BLK_MQ_S_STOPPED is checked first,
> and then queue work, but in "blk_mq_stop_hw_queue", execute cancel
> work first and then set BLK_MQ_S_STOPPED, so there is a risk of queue
> work after setting BLK_MQ_S_STOPPED, which can be solved by adjusting
> the order.
>
> Signed-off-by: Liu Song <liusong@xxxxxxxxxxxxxxxxx>
> ---
>  block/blk-mq.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 93d9d60..865915e 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2258,9 +2258,9 @@ bool blk_mq_queue_stopped(struct request_queue *q)
>   */
>  void blk_mq_stop_hw_queue(struct blk_mq_hw_ctx *hctx)
>  {
> -	cancel_delayed_work(&hctx->run_work);
> -
>  	set_bit(BLK_MQ_S_STOPPED, &hctx->state);
> +
> +	cancel_delayed_work(&hctx->run_work);
>  }
>  EXPORT_SYMBOL(blk_mq_stop_hw_queue);

What made you come up with this patch? Source code reading or something
else? Please mention this in the patch description.

Regarding the above patch, I don't think this patch fixes the existing
race between blk_mq_stop_hw_queue() and __blk_mq_delay_run_hw_queue(),
not even if cancel_delayed_work_sync() would be used. The comment block
above blk_mq_stop_hw_queue() clearly mentions that it is not guaranteed
that this function stops dispatching of requests immediately. So why
bother about fixing the existing race conditions that do not affect what
is guaranteed by blk_mq_stop_hw_queue()?

Thanks,

Bart.
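
For illustration only, here is a userspace toy model of one interleaving
that seems to survive the reordering: the run path tests the stopped
flag, the stop path then completes both of its steps, and only
afterwards does the run path queue its work. This is a sketch, not
kernel code; struct toy_hctx, toy_stop() and toy_queued_after_stop()
are hypothetical names, and plain booleans stand in for
BLK_MQ_S_STOPPED and hctx->run_work.

```c
#include <stdbool.h>

/* Toy stand-in for struct blk_mq_hw_ctx; field names are hypothetical. */
struct toy_hctx {
	bool stopped;       /* models BLK_MQ_S_STOPPED */
	bool work_pending;  /* models hctx->run_work being queued */
};

/* Stop path; set_bit_first selects the order proposed by the patch. */
static void toy_stop(struct toy_hctx *h, bool set_bit_first)
{
	if (set_bit_first) {
		h->stopped = true;        /* set_bit(BLK_MQ_S_STOPPED, ...) */
		h->work_pending = false;  /* cancel_delayed_work() */
	} else {
		h->work_pending = false;  /* cancel_delayed_work() */
		h->stopped = true;        /* set_bit(BLK_MQ_S_STOPPED, ...) */
	}
}

/*
 * Replays one fixed interleaving: the run path tests the flag, the stop
 * path then runs to completion, and only afterwards does the run path
 * queue its work. Returns true if work is pending on a stopped queue.
 */
static bool toy_queued_after_stop(bool set_bit_first)
{
	struct toy_hctx h = { .stopped = false, .work_pending = false };

	bool saw_stopped = h.stopped;     /* run path: check the flag */
	toy_stop(&h, set_bit_first);      /* stop path: both steps */
	if (!saw_stopped)
		h.work_pending = true;    /* run path: queue the work */

	return h.stopped && h.work_pending;
}
```

In this model toy_queued_after_stop() returns true for both the old
order (false) and the patched order (true), because the window sits
between the check and the queueing in the run path, not between the two
statements in the stop path.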