On Thu, Sep 27, 2018 at 11:30:19AM +0800, jianchao.wang wrote: > Hi Ming > > On 09/27/2018 12:08 AM, Ming Lei wrote: > > Lot of controllers may have only one irq vector for completing IO > > request. And usually affinity of the only irq vector is all possible > > CPUs, however, on most of ARCH, there may be only one specific CPU > > for handling this interrupt. > > > > So if all IOs are completed in hardirq context, it is inevitable to > > degrade IO performance because of increased irq latency. > > > > This patch tries to address this issue by allowing to complete request > > in softirq context, like the legacy IO path. > > > > IOPS is observed as ~13%+ in the following randread test on raid0 over > > virtio-scsi. > > > > mdadm --create --verbose /dev/md0 --level=0 --chunk=1024 --raid-devices=8 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi > > > > fio --time_based --name=benchmark --runtime=30 --filename=/dev/md0 --nrfiles=1 --ioengine=libaio --iodepth=32 --direct=1 --invalidate=1 --verify=0 --verify_fatal=0 --numjobs=32 --rw=randread --blocksize=4k > > > > Cc: Zach Marano <zmarano@xxxxxxxxxx> > > Cc: Christoph Hellwig <hch@xxxxxx> > > Cc: Bart Van Assche <bvanassche@xxxxxxx> > > Cc: Jianchao Wang <jianchao.w.wang@xxxxxxxxxx> > > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx> > > --- > > block/blk-mq.c | 14 ++++++++++++++ > > block/blk-softirq.c | 7 +++++-- > > 2 files changed, 19 insertions(+), 2 deletions(-) > > > > diff --git a/block/blk-mq.c b/block/blk-mq.c > > index 85a1c1a59c72..d4792c3ac983 100644 > > --- a/block/blk-mq.c > > +++ b/block/blk-mq.c > > @@ -565,6 +565,20 @@ static void __blk_mq_complete_request(struct request *rq) > > if (rq->internal_tag != -1) > > blk_mq_sched_completed_request(rq); > > > > + /* > > + * Most of single queue controllers, there is only one irq vector > > + * for handling IO completion, and the only irq's affinity is set > > + * as all possible CPUs. On most of ARCHs, this affinity means the > > + * irq is handled on one specific CPU. > > + * > > + * So complete IO reqeust in softirq context in case of single queue > > + * for not degrading IO performance by irqsoff latency. > > + */ > > + if (rq->q->nr_hw_queues == 1) { > > + __blk_complete_request(rq); > > + return; > > + } > > + > > if (!test_bit(QUEUE_FLAG_SAME_COMP, &rq->q->queue_flags)) { > > rq->q->softirq_done_fn(rq); > > return; > > diff --git a/block/blk-softirq.c b/block/blk-softirq.c > > index 15c1f5e12eb8..b1df9b6c1731 100644 > > --- a/block/blk-softirq.c > > +++ b/block/blk-softirq.c > > @@ -101,17 +101,20 @@ void __blk_complete_request(struct request *req) > > struct request_queue *q = req->q; > > unsigned long flags; > > bool shared = false; > > + int rq_cpu; > > > > BUG_ON(!q->softirq_done_fn); > > > > + rq_cpu = q->mq_ops ? req->mq_ctx->cpu : req->cpu; > > + > > local_irq_save(flags); > > cpu = smp_processor_id(); > > > > /* > > * Select completion CPU > > */ > > - if (req->cpu != -1) { > > - ccpu = req->cpu; > > + if (rq_cpu != -1) { > > + ccpu = q->mq_ops ? req->mq_ctx->cpu : req->cpu; > > if (!test_bit(QUEUE_FLAG_SAME_FORCE, &q->queue_flags)) > > shared = cpus_share_cache(cpu, ccpu); > > } else > > > ; > > Only QUEUE_FLAG_SAME_COMP is set, we will try to complete the request on the cpu > where it is allocated. > > For the blk-legacy, the req->cpu is -1 when QUEUE_FLAG_SAME_COMP is not set. > > blk_queue_bio > > if (test_bit(QUEUE_FLAG_SAME_COMP, &q->queue_flags)) > req->cpu = raw_smp_processor_id() > > But for the blk-mq, the req->mq_ctx->cpu must be valid. > So we should check the QUEUE_FLAG_SAME_COMP for the blk-mq case. Good catch, will fix it in V2. > > On the other hand, how about split the softirq completion part out of > __blk_complete_request ? It is confused when find __blk_complete_request > in __blk_mq_complete_request. __blk_complete_request() is just a common helper between blk-mq and legacy path. We have such usages already, so looks it is fine. Thanks, Ming