On Wed, Nov 28, 2018 at 07:19:09PM -0700, Jens Axboe wrote:
> On 11/28/18 6:23 PM, Ming Lei wrote:
> > On Tue, Nov 27, 2018 at 07:34:51PM -0700, Jens Axboe wrote:
> >> On 11/27/18 7:10 PM, Ming Lei wrote:
> >>> On Mon, Nov 26, 2018 at 09:35:53AM -0700, Jens Axboe wrote:
> >>>> We need this for blk-mq to kick things into gear, if we told it that
> >>>> we had more IO coming, but then failed to deliver on that promise.
> >>>>
> >>>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
> >>>> ---
> >>>>  drivers/block/virtio_blk.c | 15 +++++++++++++++
> >>>>  1 file changed, 15 insertions(+)
> >>>>
> >>>> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> >>>> index 6e869d05f91e..b49c57e77780 100644
> >>>> --- a/drivers/block/virtio_blk.c
> >>>> +++ b/drivers/block/virtio_blk.c
> >>>> @@ -214,6 +214,20 @@ static void virtblk_done(struct virtqueue *vq)
> >>>>  	spin_unlock_irqrestore(&vblk->vqs[qid].lock, flags);
> >>>>  }
> >>>>
> >>>> +static void virtio_commit_rqs(struct blk_mq_hw_ctx *hctx)
> >>>> +{
> >>>> +	struct virtio_blk *vblk = hctx->queue->queuedata;
> >>>> +	int qid = hctx->queue_num;
> >>>> +	bool kick;
> >>>> +
> >>>> +	spin_lock_irq(&vblk->vqs[qid].lock);
> >>>> +	kick = virtqueue_kick_prepare(vblk->vqs[qid].vq);
> >>>> +	spin_unlock_irq(&vblk->vqs[qid].lock);
> >>>> +
> >>>> +	if (kick)
> >>>> +		virtqueue_notify(vblk->vqs[qid].vq);
> >>>> +}
> >>>> +
> >>>>  static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
> >>>>  			   const struct blk_mq_queue_data *bd)
> >>>>  {
> >>>> @@ -638,6 +652,7 @@ static void virtblk_initialize_rq(struct request *req)
> >>>>
> >>>>  static const struct blk_mq_ops virtio_mq_ops = {
> >>>>  	.queue_rq	= virtio_queue_rq,
> >>>> +	.commit_rqs	= virtio_commit_rqs,
> >>>>  	.complete	= virtblk_request_done,
> >>>>  	.init_request	= virtblk_init_request,
> >>>>  #ifdef CONFIG_VIRTIO_BLK_SCSI
> >>>> --
> >>>> 2.17.1
> >>>>
> >>>
> >>> If .commit_rqs() is implemented, virtqueue_notify() in virtio_queue_rq()
> >>> should have been removed for saving the world switch per .queue_rq()
> >>
> >> ->commit_rqs() is only for the case where bd->last is set to false,
> >> and we never make it to the end and flag bd->last == true. If bd->last
> >> is true, the driver should kick things into gear.
> >
> > OK, looks I misunderstood it. However, virtio-blk doesn't need this
> > change since virtio_queue_rq() can handle it well. This patch may introduce
> > one unnecessary VM world switch in case of queue busy.
>
> No it won't, it may in the case of some failure outside of the driver.

If the failure is because we have run out of driver tags, blk_mq_dispatch_wake()
will rerun the queue and bd->last will be set eventually.

Or is there some other failure (outside of the driver) that isn't covered?

> The only reason that virtio-blk doesn't currently hang is because it
> has restart logic, and the failure case only happens if we
> already have IO in-flight.

Yeah, virtqueue_kick() is called on any error in virtio_queue_rq()
(roughly the pattern in the sketch further down), so I am still wondering
why we have to implement .commit_rqs() for virtio-blk.

> For the NVMe variant, that's not going to be the case.

OK.

> > IMO bd->last won't work well in case of io scheduler given the rq_list
> > only includes one single request.
>
> But that's a fake limitation that definitely should just be lifted,
> the fact that blk-mq-sched is _currently_ just doing single requests
> is woefully inefficient.

I agree, but it seems a bit hard given that we have to consider request
merging.
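BTW, to spell out what I mean above about the error path: the helper below
is illustrative only (the name virtblk_kick_on_error() is made up, and it
paraphrases the failure handling in virtio_queue_rq() rather than copying
it). It shows that the driver already kicks whatever is sitting in the ring
before returning busy:

/*
 * Illustrative only, not a real driver function: the same kick pattern
 * virtio_queue_rq() already uses when adding a request to the vq fails,
 * written in the style of virtio_commit_rqs() above.
 */
static blk_status_t virtblk_kick_on_error(struct blk_mq_hw_ctx *hctx, int err)
{
	struct virtio_blk *vblk = hctx->queue->queuedata;
	int qid = hctx->queue_num;

	spin_lock_irq(&vblk->vqs[qid].lock);
	/* notify the host about the requests already queued in the ring */
	virtqueue_kick(vblk->vqs[qid].vq);
	spin_unlock_irq(&vblk->vqs[qid].lock);

	blk_mq_stop_hw_queue(hctx);

	return (err == -ENOMEM || err == -ENOSPC) ?
		BLK_STS_DEV_RESOURCE : BLK_STS_IOERR;
}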
> > I wrote this kind of patch (never posted) before to use sort of
> > ->commit_rqs() to replace the current bd->last mechanism which needs
> > one extra driver tag, which may improve the above case, also code gets
> > cleaned up.
>
> It doesn't need one extra driver tag, we currently get an extra one just
> to flag ->last correctly. That's not a requirement, that's a limitation
> of the current implementation. We could get rid of that, and if it
> proves to be an issue, that's not hard to do.

What do you think about using .commit_rqs() to replace ->last?

For example, just call .commit_rqs() after the last request in the batch
has been queued to the driver successfully. Then we can remove bd->last
and avoid getting the extra driver tag just to figure out bd->last. A
rough, untested sketch of what I mean is appended below my signature.

Thanks,
Ming
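P.S. Totally untested sketch, only to illustrate the idea above. The helper
name blk_mq_commit_batch() is made up; it would be called by the blk-mq
dispatch code once after a batch has been sent via ->queue_rq(), instead of
flagging bd->last on the final request:

/*
 * Untested, illustrative only: notify the driver that a dispatch batch
 * is complete.  The driver's ->commit_rqs() (e.g. virtio_commit_rqs()
 * above) then issues a single notify/doorbell for the whole batch.
 */
static void blk_mq_commit_batch(struct blk_mq_hw_ctx *hctx, unsigned int queued)
{
	/* nothing reached the driver, so there is nothing to kick */
	if (!queued)
		return;

	if (hctx->queue->mq_ops->commit_rqs)
		hctx->queue->mq_ops->commit_rqs(hctx);
}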