Any request queue data structure may change while a queue is frozen. Hence make sure that blk_mq_run_hw_queues() does not access any hw queue while a request queue is frozen. After blk_cleanup_queue() has marked a queue as dead it is no longer safe to access the hardware queue data structures. This patch avoids that blk_mq_run_hw_queues() crashes when called during or after blk_cleanup_queue() has freed the hardware queues. This patch is a variant of a patch posted by Hannes Reinecke ("[PATCH] block: don't call blk_mq_run_hw_queues() for dead or dying queues "). This patch is similar in nature to commit c246e80d8673 ("block: Avoid that request_fn is invoked on a dead queue"; v3.8). An example of a crash that is fixed by this patch: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff8135a10b>] sbitmap_any_bit_set+0xb/0x30 Call Trace: [<ffffffff81303a88>] blk_mq_run_hw_queues+0x48/0x90 [<ffffffff813053cc>] blk_mq_requeue_work+0x10c/0x120 [<ffffffff81098cb4>] process_one_work+0x154/0x410 [<ffffffff81099896>] worker_thread+0x116/0x4a0 [<ffffffff8109edb9>] kthread+0xc9/0xe0 [<ffffffff81619b05>] ret_from_fork+0x55/0x80 Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx> Cc: Hannes Reinecke <hare@xxxxxxxx> Cc: James Smart <james.smart@xxxxxxxxxxxx> Cc: Ming Lei <ming.lei@xxxxxxxxxx> Cc: Jianchao Wang <jianchao.w.wang@xxxxxxxxxx> Cc: Dongli Zhang <dongli.zhang@xxxxxxxxxx> Cc: <stable@xxxxxxxxxxxxxxx> Fixes: a063057d7c73 ("block: Fix a race between request queue removal and the block cgroup controller") # v4.17. Reported-by: James Smart <james.smart@xxxxxxxxxxxx> Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx> --- block/blk-mq.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/block/blk-mq.c b/block/blk-mq.c index 3ff3d7b49969..652d0c6d5945 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1499,12 +1499,20 @@ void blk_mq_run_hw_queues(struct request_queue *q, bool async) struct blk_mq_hw_ctx *hctx; int i; + /* + * Do not run any hardware queues if the queue is frozen or if a + * concurrent blk_cleanup_queue() call is removing any data + * structures used by this function. + */ + if (!percpu_ref_tryget(&q->q_usage_counter)) + return; queue_for_each_hw_ctx(q, hctx, i) { if (blk_mq_hctx_stopped(hctx)) continue; blk_mq_run_hw_queue(hctx, async); } + percpu_ref_put(&q->q_usage_counter); } EXPORT_SYMBOL(blk_mq_run_hw_queues); -- 2.21.0.196.g041f5ea1cf98