On 4/24/19 1:02 PM, Ming Lei wrote:
In the normal queue cleanup path, the hctx is released after the request
queue is freed, see blk_mq_release().
However, in __blk_mq_update_nr_hw_queues(), a hctx may be freed earlier
when the number of hw queues shrinks. This easily leads to a
use-after-free: one implicit rule is that it is safe to call almost any
block layer API as long as the request queue is alive, so a hctx
retrieved through such an API may be freed underneath the caller by
blk_mq_update_nr_hw_queues(), and the subsequent access triggers the
use-after-free.
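For illustration only (not part of the patch), a minimal sketch of the
kind of caller that relies on that implicit rule; walk_hw_queues() is a
made-up helper, the real triggers are existing APIs that iterate
q->queue_hw_ctx:

	/*
	 * The caller holds a queue reference (e.g. via blk_queue_enter()),
	 * so the queue itself cannot go away, but before this patch nothing
	 * kept the hctx around across a concurrent
	 * __blk_mq_update_nr_hw_queues() that shrinks nr_hw_queues.
	 */
	static void walk_hw_queues(struct request_queue *q)
	{
		struct blk_mq_hw_ctx *hctx;
		unsigned int i;

		queue_for_each_hw_ctx(q, hctx, i)
			pr_info("hctx %u on node %d\n", i, hctx->numa_node);
	}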
Fix this issue by always freeing the hctx after the request queue is
released. If some hctxs are removed in blk_mq_update_nr_hw_queues(),
hold them on a new per-queue list, and try to reuse these hctxs when
the NUMA node matches.
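To make the reuse part concrete (the corresponding hunk is not quoted
below), the allocation side could look roughly like the sketch here;
blk_mq_alloc_and_init_hctx(), blk_mq_alloc_hctx() and blk_mq_init_hctx()
follow the direction of the patch but are shown only as an assumed
shape, not the exact code:

	static struct blk_mq_hw_ctx *blk_mq_alloc_and_init_hctx(
			struct blk_mq_tag_set *set, struct request_queue *q,
			int hctx_idx, int node)
	{
		struct blk_mq_hw_ctx *hctx = NULL, *tmp;

		/* Reuse a dead hctx whose NUMA node matches, if any. */
		spin_lock(&q->unused_hctx_lock);
		list_for_each_entry(tmp, &q->unused_hctx_list, hctx_list) {
			if (tmp->numa_node == node) {
				hctx = tmp;
				break;
			}
		}
		if (hctx)
			list_del_init(&hctx->hctx_list);
		spin_unlock(&q->unused_hctx_lock);

		/* Otherwise allocate a fresh one, then (re)initialize it. */
		if (!hctx)
			hctx = blk_mq_alloc_hctx(q, set, node);
		if (!hctx)
			return NULL;

		if (blk_mq_init_hctx(q, set, hctx, hctx_idx)) {
			kobject_put(&hctx->kobj);
			return NULL;
		}

		return hctx;
	}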
Cc: Dongli Zhang <dongli.zhang@xxxxxxxxxx>
Cc: James Smart <james.smart@xxxxxxxxxxxx>
Cc: Bart Van Assche <bart.vanassche@xxxxxxx>
Cc: linux-scsi@xxxxxxxxxxxxxxx
Cc: Martin K. Petersen <martin.petersen@xxxxxxxxxx>
Cc: Christoph Hellwig <hch@xxxxxx>
Cc: James E. J. Bottomley <jejb@xxxxxxxxxxxxxxxxxx>
Tested-by: James Smart <james.smart@xxxxxxxxxxxx>
Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
---
block/blk-mq.c | 46 +++++++++++++++++++++++++++++++++-------------
include/linux/blk-mq.h | 2 ++
include/linux/blkdev.h | 7 +++++++
3 files changed, 42 insertions(+), 13 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1eceeb26ae7d..b9d711d12cae 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2274,6 +2274,10 @@ static void blk_mq_exit_hctx(struct request_queue *q,
set->ops->exit_hctx(hctx, hctx_idx);
blk_mq_remove_cpuhp(hctx);
+
+ spin_lock(&q->unused_hctx_lock);
+ list_add(&hctx->hctx_list, &q->unused_hctx_list);
+ spin_unlock(&q->unused_hctx_lock);
}
static void blk_mq_exit_hw_queues(struct request_queue *q,
@@ -2362,6 +2366,8 @@ blk_mq_alloc_hctx(struct request_queue *q,
hctx->queue = q;
hctx->flags = set->flags & ~BLK_MQ_F_TAG_SHARED;
+ INIT_LIST_HEAD(&hctx->hctx_list);
+
/*
* Allocate space for all possible cpus to avoid allocation at
* runtime
@@ -2675,15 +2681,17 @@ static int blk_mq_alloc_ctxs(struct request_queue *q)
*/
void blk_mq_release(struct request_queue *q)
{
- struct blk_mq_hw_ctx *hctx;
- unsigned int i;
+ struct blk_mq_hw_ctx *hctx, *next;
+ int i;
cancel_delayed_work_sync(&q->requeue_work);
- /* hctx kobj stays in hctx */
- queue_for_each_hw_ctx(q, hctx, i) {
- if (!hctx)
- continue;
+ queue_for_each_hw_ctx(q, hctx, i)
+ WARN_ON_ONCE(hctx && list_empty(&hctx->hctx_list));
+
+ /* all hctx are in .unused_hctx_list now */
+ list_for_each_entry_safe(hctx, next, &q->unused_hctx_list, hctx_list) {
+ list_del_init(&hctx->hctx_list);
kobject_put(&hctx->kobj);
}
I would consider it a bug if the hctx is still assigned to
->queue_hw_ctx at this point.
But that's a minor issue.
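One way to read that remark as code, purely as a sketch of the stricter
invariant (not what the quoted patch does): clear the queue_hw_ctx[]
slot when the hctx is unmapped, so blk_mq_release() can warn on any
leftover pointer; hctx_idx here is the index already passed to
blk_mq_exit_hctx():

	/* In the exit/shrink path, once the hctx has been moved to
	 * q->unused_hctx_list, drop the q->queue_hw_ctx[] pointer too. */
	q->queue_hw_ctx[hctx_idx] = NULL;

	/* Then blk_mq_release() can treat any remaining mapping as a bug. */
	queue_for_each_hw_ctx(q, hctx, i)
		WARN_ON_ONCE(hctx);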
Reviewed-by: Hannes Reinecke <hare@xxxxxxxx>
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@xxxxxxxx +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)