commit 01e99aeca397 ("blk-mq: insert passthrough request into hctx->dispatch directly") may change to add flush request to the tail of dispatch by applying the 'add_head' parameter of blk_mq_sched_insert_request. Turns out this way causes performance regression on NCQ controller because flush is non-NCQ command, which can't be queued when there is any in-flight NCQ command. When adding flush rq to the front of hctx->dispatch, it is easier to introduce extra time to flush rq's latency compared with adding to the tail of dispatch queue because of S_SCHED_RESTART, then chance of flush merge is increased, and less flush requests may be issued to controller. So always insert flush request to the front of dispatch queue just like before applying commit 01e99aeca397 ("blk-mq: insert passthrough request into hctx->dispatch directly"). Cc: Damien Le Moal <Damien.LeMoal@xxxxxxx> Cc: Shinichiro Kawasaki <shinichiro.kawasaki@xxxxxxx> Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@xxxxxxx> Fixes: 01e99aeca397 ("blk-mq: insert passthrough request into hctx->dispatch directly") Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx> --- block/blk-mq-sched.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 856356b1619e..74cedea56034 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -398,6 +398,28 @@ void blk_mq_sched_insert_request(struct request *rq, bool at_head, WARN_ON(e && (rq->tag != -1)); if (blk_mq_sched_bypass_insert(hctx, !!e, rq)) { + /* + * Firstly normal IO request is inserted to scheduler queue or + * sw queue, meantime we add flush request to dispatch queue( + * hctx->dispatch) directly and there is at most one in-flight + * flush request for each hw queue, so it doesn't matter to add + * flush request to tail or front of the dispatch queue. + * + * Secondly in case of NCQ, flush request belongs to non-NCQ + * command, and queueing it will fail when there is any + * in-flight normal IO request(NCQ command). When adding flush + * rq to the front of hctx->dispatch, it is easier to introduce + * extra time to flush rq's latency because of S_SCHED_RESTART + * compared with adding to the tail of dispatch queue, then + * chance of flush merge is increased, and less flush requests + * will be issued to controller. It is observed that ~10% time + * is saved in blktests block/004 on disk attached to AHCI/NCQ + * drive when adding flush rq to the front of hctx->dispatch. + * + * Simply queue flush rq to the front of hctx->dispatch so that + * intensive flush workloads can benefit in case of NCQ HW. + */ + at_head = (rq->rq_flags & RQF_FLUSH_SEQ) ? true : at_head; blk_mq_request_bypass_insert(rq, at_head, false); goto run; } -- 2.20.1