On 1/9/25 00:27, Christoph Hellwig wrote:
> On Wed, Jan 08, 2025 at 06:31:15PM +0800, Ming Lei wrote:
>>> -	if (!(q->limits.features & BLK_FEAT_POLL) &&
>>> -	    (bio->bi_opf & REQ_POLLED)) {
>>> +	if ((bio->bi_opf & REQ_POLLED) && !bdev_can_poll(bdev)) {
>>
>> submit_bio_noacct() is called without grabbing .q_usage_counter,
>> so tagset may be freed now, then use-after-free on q->tag_set?
>
> Indeed. That also means the previous check wasn't reliable either.
> I think we can simply move the check into
> blk_mq_submit_bio/__submit_bio which means we'll do a bunch more
> checks before we eventually fail, but otherwise it'll work the
> same.

Given that the request queue is the same for all tag sets, I do not
think we need to have the queue_limits_start_update()/commit_update()
within the tag set loop in __blk_mq_update_nr_hw_queues(). So something
like this should be enough for an initial fix, no?

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8ac19d4ae3c0..ac71e9cee25b 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -4986,6 +4986,7 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 						int nr_hw_queues)
 {
 	struct request_queue *q;
+	struct queue_limits lim;
 	LIST_HEAD(head);
 	int prev_nr_hw_queues = set->nr_hw_queues;
 	int i;
@@ -4999,8 +5000,10 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 	if (set->nr_maps == 1 && nr_hw_queues == set->nr_hw_queues)
 		return;
 
+	lim = queue_limits_start_update(q);
 	list_for_each_entry(q, &set->tag_list, tag_set_list)
 		blk_mq_freeze_queue(q);
+
 	/*
 	 * Switch IO scheduler to 'none', cleaning up the data associated
 	 * with the previous scheduler. We will switch back once we are done
@@ -5036,13 +5039,10 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 			set->nr_hw_queues = prev_nr_hw_queues;
 			goto fallback;
 		}
-		lim = queue_limits_start_update(q);
 		if (blk_mq_can_poll(set))
 			lim.features |= BLK_FEAT_POLL;
 		else
 			lim.features &= ~BLK_FEAT_POLL;
-		if (queue_limits_commit_update(q, &lim) < 0)
-			pr_warn("updating the poll flag failed\n");
 		blk_mq_map_swqueue(q);
 	}
 
@@ -5059,6 +5059,9 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 	list_for_each_entry(q, &set->tag_list, tag_set_list)
 		blk_mq_unfreeze_queue(q);
 
+	if (queue_limits_commit_update(q, &lim) < 0)
+		pr_warn("updating the poll flag failed\n");
+
 	/* Free the excess tags when nr_hw_queues shrink. */
 	for (i = set->nr_hw_queues; i < prev_nr_hw_queues; i++)
 		__blk_mq_free_map_and_rqs(set, i);

With that, no modification of the hot path to check the poll feature
should be needed. And I also fail to see why we need to do the queue
freeze for all tag sets. Once should be enough as well...

-- 
Damien Le Moal
Western Digital Research
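
For reference, a rough sketch of the alternative Christoph mentions above
(doing the REQ_POLLED check from blk_mq_submit_bio()/__submit_bio(), where
the queue usage counter is already held). The helper name and exact
placement below are illustrative assumptions, not from any posted patch:

/*
 * Illustrative sketch only: reject polled bios from blk_mq_submit_bio()
 * rather than submit_bio_noacct(). At this point q->q_usage_counter is
 * already held, so dereferencing q->tag_set cannot race with
 * __blk_mq_update_nr_hw_queues() freeing the tag set.
 * blk_mq_check_poll() is a hypothetical helper name.
 */
static bool blk_mq_check_poll(struct request_queue *q, struct bio *bio)
{
	if (!(bio->bi_opf & REQ_POLLED))
		return true;
	if (blk_mq_can_poll(q->tag_set))
		return true;
	/* Polling requested but not supported: fail the bio early. */
	bio->bi_status = BLK_STS_NOTSUPP;
	bio_endio(bio);
	return false;
}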