On Mon, Feb 23 2015 at 5:14pm -0500,
Benjamin Marzinski <bmarzins@xxxxxxxxxx> wrote:

> On Mon, Feb 23, 2015 at 05:46:37PM -0500, Mike Snitzer wrote:
> >
> > It is blk_queue_bio(), via q->make_request_fn, that is intended to
> > actually do the merging. What I'm hearing is that we're only getting
> > some small amount of merging if:
> > 1) the 2 path case is used and therefore ->busy hook within
> >    q->request_fn is not taking the request off the queue, so there is
> >    more potential for later merging
> > 2) the 4 path case IFF nr_requests is reduced to induce ->busy, which
> >    only promoted merging as a side-effect like 1) above
> >
> > The reality is we aren't getting merging where it _should_ be happening
> > (in blk_queue_bio). We need to understand why that is.
>
> Huh? I'm confused. If the merges that are happening (which are more
> likely if either of those two points you mentioned are true) aren't
> happening in blk_queue_bio, then where are they happening?

AFAICT, purely from this discussion and NetApp's BZ, the little merging
that is seen is happening via the ->lld_busy_fn hook. See the comment
block above blk_lld_busy().

> I thought that the issue is that requests are getting pulled off the
> multipath device's request queue and placed on the underlying device's
> request queue too quickly, so that there are no requests on multipath's
> queue to merge with when blk_queue_bio() is called. In this case, one
> solution would involve keeping multipath from removing these requests
> too quickly when we think that it is likely that another request which
> can get merged will be added soon. That's what all my ideas have been
> about.
>
> Do you think something different is happening here?

Requests are being pulled from the DM-multipath queue if
->lld_busy_fn() is false. "Too quickly" is all relative. The case NetApp
reported is with SSD devices in the backend. Any increased idling in the
interest of merging could hurt latency; but the merging may improve IOPS.
So it is a trade-off.

So what I said before and am still saying is: we need to understand why
the designed hook for merging, via q->make_request_fn's blk_queue_bio(),
isn't actually meaningful for DM multipath.

Merging should happen _before_ q->request_fn() is called, not as a
side-effect of q->request_fn() happening to have the intelligence to not
start the request because the underlying device queues are busy.

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
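
The timing argument in this thread can be illustrated with a toy
userspace model. This is only a sketch under stated assumptions, not
kernel code: toy_req, toy_queue, toy_submit(), toy_dispatch() and
toy_run() are all hypothetical names, toy_submit() merely stands in for
the submit-time merge attempt, and toy_dispatch() stands in for a
request_fn that leaves requests queued while the lower device reports
busy. It shows only that submit-time merging needs a request to still be
sitting on the top-level queue; it says nothing about why that fails in
the real DM multipath case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy request: a contiguous sector range [start, start + len). */
struct toy_req {
	unsigned long start;
	unsigned long len;
};

#define TOY_QUEUE_MAX 64

/* Toy top-level queue (hypothetical stand-in for the multipath queue). */
struct toy_queue {
	struct toy_req reqs[TOY_QUEUE_MAX];
	size_t count;		/* requests still waiting on this queue */
	unsigned int merges;	/* submit-time back merges performed */
	unsigned int dispatched;/* requests handed to the lower device */
};

/*
 * Submit-time merge attempt: back-merge the new range into a request
 * that is still queued, otherwise queue it as a new request.
 */
static void toy_submit(struct toy_queue *q, unsigned long start,
		       unsigned long len)
{
	for (size_t i = 0; i < q->count; i++) {
		if (q->reqs[i].start + q->reqs[i].len == start) {
			q->reqs[i].len += len;
			q->merges++;
			return;
		}
	}
	assert(q->count < TOY_QUEUE_MAX);
	q->reqs[q->count++] = (struct toy_req){ start, len };
}

/* Dispatch: drain the whole queue unless the lower device is busy. */
static void toy_dispatch(struct toy_queue *q, bool lower_busy)
{
	if (lower_busy)
		return;
	q->dispatched += (unsigned int)q->count;
	q->count = 0;
}

/*
 * Submit nr_bios sequential 8-sector ranges, attempting a dispatch
 * after each one. The lower device is busy except on every
 * idle_every-th submission (idle_every must be >= 1; 1 means it is
 * never busy). Returns the number of submit-time merges.
 */
static unsigned int toy_run(unsigned int nr_bios, unsigned int idle_every)
{
	struct toy_queue q = {0};

	assert(idle_every >= 1);
	for (unsigned int i = 0; i < nr_bios; i++) {
		toy_submit(&q, 8UL * i, 8);
		toy_dispatch(&q, ((i + 1) % idle_every) != 0);
	}
	return q.merges;
}
```

In this model toy_run(16, 1), where the lower device is never busy,
performs no merges because the queue is always empty at submit time,
while toy_run(16, 4) merges 12 of the 16 submissions. The merging there
is purely a side-effect of the busy condition holding requests back.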