On Thu, Feb 27, 2014 at 12:22:54PM -0500, Matthew Wilcox wrote: > On Wed, Feb 26, 2014 at 03:39:49PM -0800, Kent Overstreet wrote: > > We do this by adding calls to blk_queue_split() to the various > > make_request functions that need it - a few can already handle arbitrary > > size bios. Note that we add the call _after_ any call to blk_queue_bounce(); > > this means that blk_queue_split() and blk_recalc_rq_segments() don't need to > > be concerned with bouncing affecting segment merging. > > > diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c > > index 51824d1f23..e4376b9613 100644 > > --- a/drivers/block/nvme-core.c > > +++ b/drivers/block/nvme-core.c > > @@ -737,6 +737,8 @@ static void nvme_make_request(struct request_queue *q, struct bio *bio) > > struct nvme_queue *nvmeq = get_nvmeq(ns->dev); > > int result = -EBUSY; > > > > + blk_queue_split(q, &bio, q->bio_split); > > + > > if (!nvmeq) { > > put_nvmeq(NULL); > > bio_endio(bio, -EIO); > > I'd suggest that we do: > > - struct nvme_queue *nvmeq = get_nvmeq(ns->dev); > + struct nvme_queue *nvmeq; > int result = -EBUSY; > > + blk_queue_split(q, &bio, q->bio_split); > + > + nvmeq = get_nvmeq(ns->dev); > if (!nvmeq) { > > so that we're running the blk_queue_split() code outside the get_cpu() > call. > > Now, the NVMe driver has its own rules about when BIOs have to be split. > Right now, that's way down inside the nvme_map_bio() call when we're > walking the bio to compose the scatterlist. Should we instead have an > nvme_bio_split() routine that is called instead of blk_queue_split(), > and we can simplify nvme_map_bio() since it'll know that it's working > with bios that don't have to be split. > > In fact, I think it would have little NVMe-specific in it at that point, > so we could name __blk_bios_map_sg() better, export it to drivers and > call it from nvme_map_bio(), which I think would make everybody happier. Actually, reading nvme_map_bio() (it's different since last I looked at it) it looks like nvme should already be able to handle arbitrary size bios? I do intend to rework the blk_bio_map_sg() (or add a new one?) to incrementally map as much of a bio as will fit in the provided scatterlist, but it looks like nvme has some odd restrictions where it's using BIOVEC_PHYS_MERGABLE()/BIOVEC_NOT_VIRT_MERGABLE() so I dunno if it's worth bothering to try and have it use generic code. However we don't need an explicit split here: if the sg fills up (i.e. the places nvme_split_and_submit() is called), we can just mark the bio as partially completed (set bio->bi_iter = iter, i.e. use the iterator you passed to bio_for_each_segment), then increment bi_remaining (which just counts completions, i.e. bio_endio() calls before the bio is really completed) and resubmit the original bio. No need to allocate a split bio, or loop over the bio again in bio_split(). -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel