On Tue, Jan 16 2018 at 12:20pm -0500, Jens Axboe <axboe@xxxxxxxxx> wrote: > On 1/16/18 8:01 AM, Mike Snitzer wrote: > > From: Ming Lei <ming.lei@xxxxxxxxxx> > > > > blk_insert_cloned_request() is called in fast path of dm-rq driver, and > > in this function we append request to hctx->dispatch_list of the underlying > > queue directly. > > > > 1) This way isn't efficient enough because hctx lock is always required > > > > 2) With blk_insert_cloned_request(), we bypass underlying queue's IO > > scheduler totally, and depend on DM rq driver to do IO schedule > > completely. But DM rq driver can't get underlying queue's dispatch > > feedback at all, and this information is extreamly useful for IO merge. > > Without that IO merge can't be done basically by blk-mq, which causes > > very bad sequential IO performance. > > > > Fix this by having blk_insert_cloned_request() make use of > > blk_mq_try_issue_directly() via blk_mq_request_direct_issue(). > > blk_mq_request_direct_issue() allows a request to be dispatched to be > > issue directly to the underlying queue and provides dispatch result to > > dm-rq and blk-mq. > > > > With this, the DM's blk-mq sequential IO performance is vastly > > improved (as much as 3X in mpath/virtio-scsi testing). > > This still feels pretty hacky... > > > diff --git a/block/blk-mq.c b/block/blk-mq.c > > index 55f3a27fb2e6..3168a13cb012 100644 > > --- a/block/blk-mq.c > > +++ b/block/blk-mq.c > > @@ -1706,6 +1706,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, > > blk_qc_t new_cookie; > > blk_status_t ret = BLK_STS_OK; > > bool run_queue = true; > > + /* > > + * If @cookie is NULL do not insert the request, this mode is used > > + * by blk_insert_cloned_request() via blk_mq_request_direct_issue() > > + */ > > + bool dispatch_only = !cookie; > > + bool need_insert = false; > > Overloading 'cookie' to also mean this isn't very future proof or solid. It enables the existing interface to be used without needing to prop up something else that extends out to the edge (blk_insert_cloned_request). > And now __blk_mq_try_issue_directly() is pretty much a mess. Feels like > it should be split in two, where the other half would do the actual > insert. Then let the caller do it, if we could not issue directly. That > would be a lot more solid and easier to read. That is effectively what Ming's variant did (by splitting out the issue to a helper). BUT I'll see what I can come up with... (Ming please stand down until you hear back from me ;) Thanks, Mike