In [0], CPU usage for blk_mq_queue_tag_busy_iter() was optimized, but
blk_mq_queue_tag_busy_iter() is still called periodically from the context
below. The block layer timer uses this context to find potentially expired
commands (per request queue), which requires a tag iteration roughly every
5 seconds (defined by BLK_MAX_TIMEOUT) for each request queue.

kthread
  worker_thread
    process_one_work
      blk_mq_timeout_work
        blk_mq_queue_tag_busy_iter
          bt_iter
            blk_mq_find_and_get_req
              _raw_spin_lock_irqsave
                native_queued_spin_lock_slowpath

This patch avoids the extra tag iterations in the shared_tags case. A single
iteration over the shared tags gives the iterate function the expected
results.

Setup - AMD64 Gen-4.0 server.
64 virtual drives created using 16 NVMe drives + mpi3mr driver (in
shared_tags mode).

Test command -
fio 64.fio --rw=randread --bs=4K --iodepth=32 --numjobs=2 --ioscheduler=mq-deadline --disk_util=0

Without this patch on 5.16.0-rc5, the mpi3mr driver in shared_tags mode
delivers 4.0M IOPS against an expected ~6.0M.

Snippet of perf top

  25.42%  [kernel]  [k] native_queued_spin_lock_slowpath
   3.95%  [kernel]  [k] cpupri_set
   2.05%  [kernel]  [k] __blk_mq_get_driver_tag
   1.67%  [kernel]  [k] __rcu_read_unlock
   1.63%  [kernel]  [k] check_preemption_disabled

After applying this patch on 5.16.0-rc5, the mpi3mr driver in shared_tags
mode reaches up to 5.8M IOPS.

Snippet of perf top

   7.95%  [kernel]  [k] native_queued_spin_lock_slowpath
   5.61%  [kernel]  [k] cpupri_set
   2.98%  [kernel]  [k] acpi_processor_ffh_cstate_enter
   2.49%  [kernel]  [k] read_tsc
   2.15%  [kernel]  [k] check_preemption_disabled

[0] https://lore.kernel.org/all/9b092ca49e9b5415772cd950a3c12584@xxxxxxxxxxxxxx/

Cc: linux-block@xxxxxxxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
Cc: john.garry@xxxxxxxxxx
Cc: ming.lei@xxxxxxxxxx
Cc: sathya.prakash@xxxxxxxxxxxx
Signed-off-by: Kashyap Desai <kashyap.desai@xxxxxxxxxxxx>
---
 block/blk-mq-tag.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 995336abee33..3e0a8e79f966 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -253,7 +253,8 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 	if (!rq)
 		return true;
 
-	if (rq->q == hctx->queue && rq->mq_hctx == hctx)
+	if (rq->q == hctx->queue && (rq->mq_hctx == hctx ||
+				blk_mq_is_shared_tags(hctx->flags)))
 		ret = iter_data->fn(hctx, rq, iter_data->data, reserved);
 	blk_mq_put_rq_ref(rq);
 	return ret;
@@ -484,6 +485,14 @@ void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn,
 		if (tags->nr_reserved_tags)
 			bt_for_each(hctx, &tags->breserved_tags, fn, priv, true);
 		bt_for_each(hctx, &tags->bitmap_tags, fn, priv, false);
+
+		/* In case of shared bitmap if shared_tags is allocated, it is not required
+		 * to iterate all the hctx. Looping one hctx is good enough.
+		 */
+		if (blk_mq_is_shared_tags(hctx->flags)) {
+			blk_queue_exit(q);
+			return;
+		}
 	}
 	blk_queue_exit(q);
 }
-- 
2.18.1
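
To illustrate why one pass over the shared tags is enough, here is a minimal user-space sketch. It is not kernel code: every model_* name, the queue count, and the tag depth are hypothetical, and it models a single request queue, so the rq->q == hctx->queue check is left out. It counts one lookup per in-flight tag, standing in for the blk_mq_find_and_get_req() call (and its lock acquisition) seen in the trace above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_HCTX 4   /* hypothetical number of hardware queues */
#define MODEL_NR_TAGS 16  /* hypothetical depth of the shared tag space */

/* Hypothetical stand-in for struct request: only the owning hctx. */
struct model_rq {
	bool in_flight;
	int hctx_idx;
};

struct model_stats {
	unsigned int lookups;   /* in-flight tags looked up (one lock each) */
	unsigned int callbacks; /* times the iterate function would run */
};

/* One pass over the shared tag space on behalf of hctx_idx. */
static void model_bt_for_each(const struct model_rq *tags, int hctx_idx,
			      bool shared_tags, struct model_stats *st)
{
	for (size_t i = 0; i < MODEL_NR_TAGS; i++) {
		if (!tags[i].in_flight)
			continue;
		st->lookups++;
		/* Models the relaxed check in bt_iter(): with shared tags,
		 * a request owned by any hctx of the queue matches. */
		if (shared_tags || tags[i].hctx_idx == hctx_idx)
			st->callbacks++;
	}
}

/* Before: walk the shared tags once per hctx, filtering by owner. */
static struct model_stats model_iter_per_hctx(const struct model_rq *tags)
{
	struct model_stats st = { 0, 0 };

	for (int h = 0; h < MODEL_NR_HCTX; h++)
		model_bt_for_each(tags, h, false, &st);
	return st;
}

/* After: walk the shared tags once through the first hctx, then stop. */
static struct model_stats model_iter_once(const struct model_rq *tags)
{
	struct model_stats st = { 0, 0 };

	model_bt_for_each(tags, 0, true, &st);
	return st;
}

/* Mark every second tag in flight, spread round-robin over the hctxs. */
static void model_fill(struct model_rq *tags)
{
	for (size_t i = 0; i < MODEL_NR_TAGS; i++) {
		tags[i].in_flight = (i % 2 == 0);
		tags[i].hctx_idx = (int)((i / 2) % MODEL_NR_HCTX);
	}
}

/* Both strategies must run the callback once per in-flight request. */
static void model_check(const struct model_rq *tags)
{
	struct model_stats before = model_iter_per_hctx(tags);
	struct model_stats after = model_iter_once(tags);

	assert(before.callbacks == after.callbacks);
	assert(after.lookups * MODEL_NR_HCTX == before.lookups);
}
```

Calling model_fill() on an array of MODEL_NR_TAGS entries and then model_check() confirms that, in this model, both strategies invoke the callback once per in-flight request, while the per-hctx walk performs MODEL_NR_HCTX times as many lookups. With the values above that is 32 lookups against 8 for the same 8 callbacks.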