On 3/22/22 7:27 PM, Ming Lei wrote:
> On Mon, Mar 21, 2022 at 12:32:08PM +0530, Kanchan Joshi wrote:
>> On Mon, Mar 14, 2022 at 10:40:53PM +0800, Ming Lei wrote:
>>> On Thu, Mar 10, 2022 at 06:10:08PM +0530, Kanchan Joshi wrote:
>>>> On Thu, Mar 10, 2022 at 2:04 PM Christoph Hellwig <hch@xxxxxx> wrote:
>>>>>
>>>>> On Tue, Mar 08, 2022 at 08:50:58PM +0530, Kanchan Joshi wrote:
>>>>>> From: Jens Axboe <axboe@xxxxxxxxx>
>>>>>>
>>>>>> Add support to use plugging if it is enabled, else use default path.
>>>>>
>>>>> The subject and this comment don't really explain what is done, and
>>>>> also don't mention at all why it is done.
>>>>
>>>> Missed out, will fix up. But plugging gave a very good hike to IOPS.
>>>
>>> But how does plugging improve IOPS here for passthrough requests? I don't
>>> see plug->nr_ios wired to data.nr_tags in blk_mq_alloc_request(),
>>> which is called by nvme_submit_user_cmd().
>>
>> Yes, one tag at a time for each request, but none of the requests gets
>> dispatched; they are instead added to the plug. And when io_uring ends the
>> plug, the whole batch gets dispatched via ->queue_rqs (otherwise it used
>> to be via ->queue_rq, one request at a time).
>>
>> The impact of .plug alone looks like this on passthru-randread:
>>
>> KIOPS(depth_batch)   1_1    8_2    64_16    128_32
>> Without plug         159    496    784      785
>> With plug            159    525    991      1044
>>
>> Hope that clarifies it.
>
> OK, thanks for your confirmation; then the improvement should be from
> batch submission only.
>
> If cached requests are enabled, I guess the numbers could be better.

Yes, my original test patch pre-dates being able to set a submit count,
so that would definitely help improve this case too. The current win is
indeed just from being able to use ->queue_rqs() rather than single
submit.

-- 
Jens Axboe