On Wed, Nov 17, 2021 at 12:48:14PM -0800, Keith Busch wrote:
> On Wed, Nov 17, 2021 at 08:43:02AM -0700, Jens Axboe wrote:
> > On 11/17/21 1:20 AM, Ming Lei wrote:
> > > On Tue, Nov 16, 2021 at 08:38:04PM -0700, Jens Axboe wrote:
> > >> If we have a list of requests in our plug list, send it to the driver in
> > >> one go, if possible. The driver must set mq_ops->queue_rqs() to support
> > >> this, if not the usual one-by-one path is used.
> > >>
> > >> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
> > >> ---
> > >>  block/blk-mq.c         | 17 +++++++++++++++++
> > >>  include/linux/blk-mq.h |  8 ++++++++
> > >>  2 files changed, 25 insertions(+)
> > >>
> > >> diff --git a/block/blk-mq.c b/block/blk-mq.c
> > >> index 9b4e79e2ac1e..005715206b16 100644
> > >> --- a/block/blk-mq.c
> > >> +++ b/block/blk-mq.c
> > >> @@ -2208,6 +2208,19 @@ static void blk_mq_plug_issue_direct(struct blk_plug *plug, bool from_schedule)
> > >>  	int queued = 0;
> > >>  	int errors = 0;
> > >>
> > >> +	/*
> > >> +	 * Peek first request and see if we have a ->queue_rqs() hook. If we
> > >> +	 * do, we can dispatch the whole plug list in one go. We already know
> > >> +	 * at this point that all requests belong to the same queue, caller
> > >> +	 * must ensure that's the case.
> > >> +	 */
> > >> +	rq = rq_list_peek(&plug->mq_list);
> > >> +	if (rq->q->mq_ops->queue_rqs) {
> > >> +		rq->q->mq_ops->queue_rqs(&plug->mq_list);
> > >> +		if (rq_list_empty(plug->mq_list))
> > >> +			return;
> > >> +	}
> > >> +
> > >
> > > Then BLK_MQ_F_TAG_QUEUE_SHARED isn't handled as before for multiple NVMe
> > > NS.
> >
> > Can you expand? If we have multiple namespaces in the plug list, we have
> > multiple queues. There's no direct issue of the list if that's the case.
> > Or maybe I'm missing what you mean here?
>
> If the plug list only has requests for one namespace, I think Ming is
> referring to the special accounting for BLK_MQ_F_TAG_QUEUE_SHARED in
> __blk_mq_get_driver_tag() that normally gets called before dispatching
> to the driver, but isn't getting called when using .queue_rqs().

Yeah, that is it.

This is a common case: each task runs I/O on a different namespace, but
all the namespaces share and contend for the single set of host tags.
BLK_MQ_F_TAG_QUEUE_SHARED is supposed to provide fair allocation among
all of these namespaces.

thanks,
Ming
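
To make the fairness concern concrete, here is a minimal user-space sketch,
not the kernel's code. All names (shared_tags, ns_queue, fair_share,
issue_accounted, issue_unaccounted) are hypothetical, and the ceiling-divide
share with a floor of 4 is an assumption chosen for illustration. It only
models the idea that a fairness check reads a per-namespace counter, and
that a dispatch path which never updates that counter is never throttled.

```c
#include <stdbool.h>

/* Hypothetical model of one host tag set shared by several namespaces. */
struct shared_tags {
	unsigned int depth;         /* total host tags */
	unsigned int active_queues; /* namespaces currently issuing I/O */
};

/* Hypothetical per-namespace state. */
struct ns_queue {
	unsigned int nr_active; /* requests the fairness check knows about */
	unsigned int in_flight; /* requests actually holding a tag */
};

/* Assumed policy: ceil(depth / active_queues), never below 4. */
static unsigned int fair_share(const struct shared_tags *tags)
{
	unsigned int share;

	if (tags->active_queues == 0)
		return tags->depth;
	share = (tags->depth + tags->active_queues - 1) / tags->active_queues;
	return share < 4 ? 4 : share;
}

/* The check only sees nr_active, not the true in-flight count. */
static bool may_queue(const struct shared_tags *tags, const struct ns_queue *q)
{
	return q->nr_active < fair_share(tags);
}

/* One-by-one path: every issued request is accounted. */
static unsigned int issue_accounted(const struct shared_tags *tags,
				    struct ns_queue *q, unsigned int nr)
{
	unsigned int issued = 0;

	while (issued < nr && may_queue(tags, q)) {
		q->nr_active++;
		q->in_flight++;
		issued++;
	}
	return issued;
}

/* Batched path that skips the accounting: nr_active never moves. */
static unsigned int issue_unaccounted(const struct shared_tags *tags,
				      struct ns_queue *q, unsigned int nr)
{
	unsigned int issued = 0;

	while (issued < nr && may_queue(tags, q)) {
		q->in_flight++;
		issued++;
	}
	return issued;
}
```

With depth = 64 and active_queues = 4, the share in this model is 16. A
namespace that asks for 40 requests through issue_accounted() is stopped at
16, while the same request through issue_unaccounted() gets all 40, because
the check keeps seeing nr_active == 0.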