On 2019/09/19 17:48, Kashyap Desai wrote:
>>>> -	} else if (plug && (q->nr_hw_queues == 1 || q->mq_ops->commit_rqs)) {
>>>> +	} else if (plug && q->mq_ops->commit_rqs) {
>>>>  		/*
>>>>  		 * Use plugging if we have a ->commit_rqs() hook as well, as
>>>>  		 * we know the driver uses bd->last in a smart fashion.
>>>> @@ -2020,9 +2019,6 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>>>>  			blk_mq_try_issue_directly(data.hctx, same_queue_rq,
>>>>  					&cookie);
>>>>  		}
>>>> -	} else if ((q->nr_hw_queues > 1 && is_sync) || (!q->elevator &&
>>>> -			!data.hctx->dispatch_busy)) {
>>>> -		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
> Hannes -
>
> Earlier check prior to "commit 6ce3dd6eec114930cf2035a8bcb1e80477ed79a8"
> was only (q->nr_hw_queues > 1 && is_sync).
> I am not sure if check of nr_hw_queues are required or not at this place,
> but other part of check (!q->elevator && !data.hctx->dispatch_busy) to
> qualify for direct dispatch is required for higher performance.
>
> Recent MegaRaid and MPT HBA Aero series controller is capable of doing
> ~3.0 M IOPs and for such high performance using single hardware queue,
> commit 6ce3dd6eec114930cf2035a8bcb1e80477ed79a8 is very important.

Kashyap, Ming,

Thanks for the information. We will restore this case.

>
> Kashyap
>
>
>>>
>>> It may be worth mentioning that blk_mq_sched_insert_request() will do
>>> a direct insert of the request using __blk_mq_insert_request(). But
>>> that insert is slightly different from what
>>> blk_mq_try_issue_directly() does with
>>> __blk_mq_issue_directly() as the request in that case is passed along
>>> to the device using queue->mq_ops->queue_rq() while
>>> __blk_mq_insert_request() will put the request in ctx->rq_lists[type].
>>>
>>> This removes the optimized case !q->elevator &&
>>> !data.hctx->dispatch_busy, but I am not sure of the actual performance
>>> impact yet. We may want to patch
>>> blk_mq_sched_insert_request() to handle that case.
>>
>> The optimization did improve IOPS of single queue SCSI SSD a lot, see
>>
>> commit 6ce3dd6eec114930cf2035a8bcb1e80477ed79a8
>> Author: Ming Lei <ming.lei@xxxxxxxxxx>
>> Date:   Tue Jul 10 09:03:31 2018 +0800
>>
>>     blk-mq: issue directly if hw queue isn't busy in case of 'none'
>>
>>     In case of 'none' io scheduler, when hw queue isn't busy, it isn't
>>     necessary to enqueue request to sw queue and dequeue it from
>>     sw queue because request may be submitted to hw queue asap without
>>     extra cost, meantime there shouldn't be much request in sw queue,
>>     and we don't need to worry about effect on IO merge.
>>
>>     There are still some single hw queue SCSI HBAs(HPSA, megaraid_sas, ...)
>>     which may connect high performance devices, so 'none' is often required
>>     for obtaining good performance.
>>
>>     This patch improves IOPS and decreases CPU utilization on megaraid_sas,
>>     per Kashyap's test.
>>
>>
>> Thanks,
>> Ming
>

-- 
Damien Le Moal
Western Digital Research