On Tue, Aug 30, 2016 at 5:57 AM, Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote: > > > On Mon, 29 Aug 2016, Ming Lei wrote: > >> On Sat, Aug 27, 2016 at 11:09 PM, Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote: >> > >> > >> > On Fri, 26 Aug 2016, Mike Snitzer wrote: >> > >> >> On Thu, Aug 25 2016 at 4:13pm -0400, >> >> Jens Axboe <axboe@xxxxxxxxx> wrote: >> >> >> >> > On 08/25/2016 12:34 PM, Mikulas Patocka wrote: >> >> > > >> >> > >Device mapper can't split the bio in generic_make_request - it frees the >> >> > >md->queue->bio_split bioset, to save one kernel thread per device. Device >> >> > >mapper uses its own splitting mechanism. >> >> > > >> >> > >So what is the final decision? - should device mapper split the big bio or >> >> > >should bcache not submit big bios? >> >> > > >> >> > >I think splitting big bios in the device mapper is better - simply because >> >> > >it is much less code than reworking bcache to split bios internally. >> >> > > >> >> > >BTW. In the device mapper, we have a layer dm-io, that was created to work >> >> > >around bio size limitations - it accepts unlimited I/O request and splits >> >> > >it to several bios. When bio size limitations are gone, we could simplify >> >> > >dm-io too. >> >> > >> >> > The patch from Ming Lei was applied for 4.8 the other day. >> >> >> >> See linux-block commit: >> >> 4d70dca4eadf2f block: make sure a big bio is split into at most 256 bvecs >> >> http://git.kernel.dk/cgit/linux-block/commit/?h=for-linus&id=4d70dca4eadf2f95abe389116ac02b8439c2d16c >> > >> > But this patch won't work for device mapper, blk_bio_segment_split is >> > called from blk_queue_split and device mapper doesn't use blk_queue_split >> > (it can't because it frees queue->bio_split). >> > >> > We still need these two patches: >> > https://www.redhat.com/archives/dm-devel/2016-May/msg00211.html >> > https://www.redhat.com/archives/dm-devel/2016-May/msg00210.html >> >> About the 2nd patch, it might not be good enough because in theory >> a small size bio still may include big bvecs, such as, each bvec points >> to 512byte buffer, so strictly speaking the bvec number should >> be checked instead of bio size. >> >> Ming Lei > > This is not a problem. I meant the following code in your 2nd patch: + if (unlikely(bio->bi_iter.bi_size > BIO_MAX_SIZE) && + (bio->bi_rw & (REQ_FLUSH | REQ_DISCARD | REQ_WRITE)) == REQ_WRITE) + dm_accept_partial_bio(bio, BIO_MAX_SIZE >> SECTOR_SHIFT); And the check on .bi_size may not work. > > dm-crypt allocates new pages for the bio and copies (and encrypts) the > data from the original location to the new pages. > > So yes, the original bio may have a long vector with many small fragments, > but the newly allocated memory will be allocated in full pages and the > outgoing bio's vector will have full pages. > > Mikulas -- Ming Lei -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html