Re: dm-mpath: Work with blk multi-queue drivers

Mike Snitzer <snitzer@xxxxxxxxxx> · Wed, 24 Sep 2014 14:34:54 -0400

On Tue, Sep 23 2014 at  1:03pm -0400,
Keith Busch <keith.busch@xxxxxxxxx> wrote:

> I hear there may be some resistance to add blk-mq support to dm-mpath
> anyway, but it seems too easy to add support to not at least try. :)

Why?  Not sure who you heard that from but I'm not opposed to it if it
is done properly.

On Wed, Sep 24 2014 at  1:20pm -0400,
Keith Busch <keith.busch@xxxxxxxxx> wrote:

> On Wed, 24 Sep 2014, Christoph Hellwig wrote:
> >>index bf930f4..4c5952b 100644
> >>--- a/block/blk-core.c
> >>+++ b/block/blk-core.c
> >
... 
> >> 	unsigned long flags;
> >> 	struct pgpath *pgpath;
> >> 	struct block_device *bdev;
> >>@@ -410,9 +411,11 @@ static int multipath_map(struct dm_target *ti, struct request *clone,
> >> 		goto out_unlock;
> >>
> >> 	bdev = pgpath->path.dev->bdev;
> >>-	clone->q = bdev_get_queue(bdev);
> >>-	clone->rq_disk = bdev->bd_disk;
> >>-	clone->cmd_flags |= REQ_FAILFAST_TRANSPORT;
> >>+	*clone = blk_get_request(bdev_get_queue(bdev), rq_data_dir(rq),
> >>+							GFP_KERNEL);
> >
> >I suspect this should be GFP_NOWAIT so that we can push the request back
> >up the stack if none are available.
> 
> Aha, nice catch. I'd have hit the BUG_ON() in dm_request_fn() the way
> I wrote this since irq's could have been re-enabled if a request wasn't
> immediately available.
> 
> Thanks for the feedback!

We have never allowed new memory allocation in a DM target's .map or
.map_rq method.  The only allocations that are allowed are those backed
by a mempool (see md->io_pool usage in alloc_rq_tio, before the clone's
struct request was embedded in struct dm_rq_target_io).  This was done
to allow request-based DM devices to be stacked on each other.  Even
though nothing ever made use of stacking request-based DM devices; doing
so is still a DM design possibility.  Could be we can simply say "kill
request-based DM device stacking, it isn't needed".. if we went that far
then the per-rq-based-device mempool becomes much less important.

But building on this concern, historically each request-based DM device
has maintained a reserve for cloning a request (so that forward progress
can be ensured even if system memory is unavailable).  So AFAICT
Christoph's suggestion to switch to GFP_NOWAIT from GFP_KERNEL still
doesn't do enough to preserve the higher standards rq-based DM had
established for guaranteed ability to clone a request.

Taking a step back, I'm really _not_ liking the duality of the DM core
with regard to the cloning of a request.  Where a request-based DM
target (e.g. mpath) is tasked with calling blk_get_request() but then DM
core is left to manage the life-cycle of that clone that the target
allocated.  If anything a new dm_get_request() wrapper should be added
to DM core, subtle difference but then at least one can easily see
balanced request management within the DM core?

In general I'd really like to see a bit more care taken to improve the
block interface that DM is using for request-based DM.  What I mean is,
Hannes proposed a series that eliminated request-based DM's cloning of
requests (and associated bios), see thread:
http://www.redhat.com/archives/dm-devel/2014-June/msg00023.html

(that series was just a "test balloon" as Hannes put it, but it offered
some serious "cleanup" by nuking a lot of fiddly block layer code in DM
core)

I never did take the time to properly review Hannes' proposal but now
that you're floating this blk-mq support for DM core (and DM mpath) I'm
clearly going to have to take this all on in a much more focused way.

Christoph/Hannes/Junichi/Keith/others, can you see a way forward that
offers a lighter request-based DM that makes required callouts to (new?)
block interfaces that helps us abstract the old request and blk-mq
request allocation, etc?

Mike

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel