On Sun, 2017-08-27 at 00:33 +0800, Ming Lei wrote: > During dispatching, we moved all requests from hctx->dispatch to > one temporary list, then dispatch them one by one from this list. > Unfortunately duirng this period, run queue from other contexts ^^^^^^ during? > may think the queue is idle, then start to dequeue from sw/scheduler > queue and still try to dispatch because ->dispatch is empty. This way > hurts sequential I/O performance because requests are dequeued when > lld queue is busy. > [ ... ] > diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c > index 735e432294ab..4d7bea8c2594 100644 > --- a/block/blk-mq-sched.c > +++ b/block/blk-mq-sched.c > @@ -146,7 +146,6 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) > struct request_queue *q = hctx->queue; > struct elevator_queue *e = q->elevator; > const bool has_sched_dispatch = e && e->type->ops.mq.dispatch_request; > - bool do_sched_dispatch = true; > LIST_HEAD(rq_list); > > /* RCU or SRCU read lock is needed before checking quiesced flag */ Shouldn't blk_mq_sched_dispatch_requests() set BLK_MQ_S_DISPATCH_BUSY just after the following statement because this statement makes the dispatch list empty? list_splice_init(&hctx->dispatch, &rq_list); > @@ -177,8 +176,33 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) > */ > if (!list_empty(&rq_list)) { > blk_mq_sched_mark_restart_hctx(hctx); > - do_sched_dispatch = blk_mq_dispatch_rq_list(q, &rq_list); > - } else if (!has_sched_dispatch && !q->queue_depth) { > + blk_mq_dispatch_rq_list(q, &rq_list); > + > + /* > + * We may clear DISPATCH_BUSY just after it > + * is set from another context, the only cost > + * is that one request is dequeued a bit early, > + * we can survive that. Given the window is > + * too small, no need to worry about performance ^^^ The word "too" seems extraneous to me in this sentence. > bool blk_mq_sched_try_merge(struct request_queue *q, struct bio *bio, > @@ -330,6 +353,7 @@ static bool blk_mq_sched_bypass_insert(struct blk_mq_hw_ctx *hctx, > */ > spin_lock(&hctx->lock); > list_add(&rq->queuelist, &hctx->dispatch); > + set_bit(BLK_MQ_S_DISPATCH_BUSY, &hctx->state); > spin_unlock(&hctx->lock); > return true; > } Is it necessary to make blk_mq_sched_bypass_insert() set BLK_MQ_S_DISPATCH_BUSY? My understanding is that only code that makes the dispatch list empty should set BLK_MQ_S_DISPATCH_BUSY. However, blk_mq_sched_bypass_insert() adds an element to the dispatch list so that guarantees that that list is not empty. > diff --git a/block/blk-mq.c b/block/blk-mq.c > index f063dd0f197f..6af56a71c1cd 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -1140,6 +1140,11 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list) > > spin_lock(&hctx->lock); > list_splice_init(list, &hctx->dispatch); > + /* > + * DISPATCH_BUSY won't be cleared until all requests > + * in hctx->dispatch are dispatched successfully > + */ > + set_bit(BLK_MQ_S_DISPATCH_BUSY, &hctx->state); > spin_unlock(&hctx->lock); Same comment here - since this code adds one or more requests to the dispatch list, is it really needed to set the DISPATCH_BUSY flag? Bart.