On 09/23/2014 07:03 PM, Keith Busch wrote:
> I'm working with multipathing nvme devices using the blk-mq version
> of the nvme driver, but dm-mpath only works with the older
> request-based drivers. This patch proposes to enable dm-mpath to work
> with both types of request queues and is successful with my
> dual-ported nvme drives.
>
> I think there may still be fix-ups to do around submission-side error
> handling, but I think it's at a decent stopping point to solicit
> feedback before I pursue taking it further. I hear there may be some
> resistance to adding blk-mq support to dm-mpath anyway, but it seems
> too easy to add support to not at least try. :)
>
> To work, this has dm allocate requests from the underlying device's
> request_queue rather than allocating one on its own, so the cloned
> request is properly allocated and initialized for the device's
> request_queue. The original request's 'special' now points to the
> dm_rq_target_io rather than at the cloned request, because the clone
> is allocated later by the block layer rather than by dm; all the
> other back-referencing to the original then works out. The block
> layer then inserts the cloned request using the function appropriate
> for the request_queue type rather than just calling q->request_fn().
>
> Compile tested on 3.17-rc6; runtime tested on Matias Bjorling's
> linux-collab nvmemq_review using 3.16.

The resistance wasn't so much to enabling multipath for blk-mq; it was
about _how_ multipath should be modelled on top of blk-mq.

With a simple enabling we actually get two layers of I/O scheduling:
once in multipathing, to select between the individual queues, and
once in blk-mq, to select the correct hardware context. So we end up
with a four-tiered hierarchy:

  m priority groups -> n pg_paths/request_queues -> o cpus -> p hctx

giving us a full m * n * p variety of queues to which an I/O might be
sent (as the hctx are tagged per CPU, the o CPUs map onto the p
hardware contexts).

Performance-wise it might be beneficial to tie a hardware context to a
given path, effectively removing I/O scheduling from blk-mq. But this
would require a substantial update to the current blk-mq design
(blocked paths, dynamic reconfiguration).

However, this looks like a good starting point. I'll give it a go and
see how far I'll get with it.

Cheers,

Hannes
--
Dr. Hannes Reinecke                   zSeries & Storage
hare@xxxxxxx                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
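
For reference, the allocation and insertion flow Keith describes above
maps roughly onto the 3.17-era block-layer API as in the minimal
sketch below. blk_get_request(), blk_insert_cloned_request(),
blk_mq_insert_request() and the q->mq_ops check are the real
interfaces of that kernel; the clone_to_path() and insert_clone()
helpers and their error handling are illustrative assumptions, not
code taken from the patch:

    #include <linux/blkdev.h>
    #include <linux/blk-mq.h>

    /*
     * Allocate the clone from the *path* device's request_queue, so
     * the request is initialized for whatever type that queue is: on
     * a blk-mq queue, blk_get_request() goes through
     * blk_mq_alloc_request() and assigns a tag (and thus a hardware
     * context); on a legacy queue it allocates from the queue's
     * request pool.  (clone_to_path() is a hypothetical helper.)
     */
    static struct request *clone_to_path(struct request *rq,
                                         struct block_device *bdev)
    {
        struct request_queue *q = bdev_get_queue(bdev);

        /* In 3.17, blk_get_request() returns NULL on failure. */
        return blk_get_request(q, rq_data_dir(rq), GFP_ATOMIC);
    }

    /*
     * Insert the clone with the primitive matching the queue type
     * instead of unconditionally calling q->request_fn().
     */
    static int insert_clone(struct request_queue *q,
                            struct request *clone)
    {
        if (q->mq_ops) {
            /* blk-mq: queue on the hctx owning the clone's tag */
            blk_mq_insert_request(clone, false, true, false);
            return 0;
        }
        /* legacy: queues the clone and runs q->request_fn()
         * under the queue lock */
        return blk_insert_cloned_request(q, clone);
    }

How the patch actually wires this into dm's request-clone path may
well differ; the point is only that allocating the clone from the
path's own queue and then picking the matching insertion primitive is
what lets the same dm-mpath code drive both queue types.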