On 9/23/22 9:13 AM, Pankaj Raghav wrote:
> On 2022-09-23 16:52, Pankaj Raghav wrote:
>> On Thu, Sep 22, 2022 at 12:28:01PM -0600, Jens Axboe wrote:
>>> The filesystem IO path can take advantage of allocating batches of
>>> requests, if the underlying submitter tells the block layer about it
>>> through the blk_plug. For passthrough IO, the exported API is the
>>> blk_mq_alloc_request() helper, and that one does not allow for
>>> request caching.
>>>
>>> Wire up request caching for blk_mq_alloc_request(), which is generally
>>> done without having a bio available upfront.
>>>
>>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>>> ---
>>>  block/blk-mq.c | 80 ++++++++++++++++++++++++++++++++++++++++++++------
>>>  1 file changed, 71 insertions(+), 9 deletions(-)
>>>
>> I think we need this patch to ensure correct behaviour for passthrough:
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index c11949d66163..840541c1ab40 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -1213,7 +1213,7 @@ void blk_execute_rq_nowait(struct request *rq, bool at_head)
>>         WARN_ON(!blk_rq_is_passthrough(rq));
>>
>>         blk_account_io_start(rq);
>> -       if (current->plug)
>> +       if (blk_mq_plug(rq->bio))
>>                 blk_add_rq_to_plug(current->plug, rq);
>>         else
>>                 blk_mq_sched_insert_request(rq, at_head, true, false);
>>
>> As the passthrough path can now support request caching via blk_mq_alloc_request(),
>> and it uses blk_execute_rq_nowait(), bad things can happen at least for zoned
>> devices:
>>
>> static inline struct blk_plug *blk_mq_plug( struct bio *bio)
>> {
>>         /* Zoned block device write operation case: do not plug the BIO */
>>         if (bdev_is_zoned(bio->bi_bdev) && op_is_write(bio_op(bio)))
>>                 return NULL;
>> ..
>
> Thinking more about it, even this will not fix it because op is
> REQ_OP_DRV_OUT if it is a NVMe write for passthrough requests.
>
> @Damien Should the condition in blk_mq_plug() be changed to:
>
> static inline struct blk_plug *blk_mq_plug( struct bio *bio)
> {
>         /* Zoned block device write operation case: do not plug the BIO */
>         if (bdev_is_zoned(bio->bi_bdev) && !op_is_read(bio_op(bio)))
>                 return NULL;

That looks reasonable to me. It'll prevent plug optimizations even
for passthrough on zoned devices, but that's probably fine.

-- 
Jens Axboe