On Mon, 2016-02-29 at 00:59 +0800, Ming Lei wrote:
> On Mon, Feb 29, 2016 at 12:45 AM, James Bottomley
> <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> > On Sun, 2016-02-28 at 08:29 -0800, Christoph Hellwig wrote:
> > > On Sun, Feb 28, 2016 at 08:26:46AM -0800, James Bottomley wrote:
> > > > You mean in bio_add_page() the code which currently aggregates
> > > > chunks within a page could build a bio vec entry up to the max
> > > > segment size?  I think that is reasonable, especially now the
> > > > bio splitting code can actually split inside a bio vec entry.
> > >
> > > Yes.  Kent has an old prototype that did this at:
> > >
> > > https://evilpiepirate.org/git/linux-bcache.git/log/?h=block_stuff
> > >
> > > I don't think any of that is reusable as-is, but the basic idea
> > > is sound and very useful.
> >
> > The basic idea, yes, but the actual code in that tree would still
> > have built up bv entries that are too big.  We have to thread
> > bio_add_page() with knowledge of the queue limits, which is somewhat
> > hard since they're deliberately queue agnostic.  Perhaps some global
> > minimum queue segment size would work?
>
> IMO, we can simply build each contiguous segment into one vector,
> because bio_add_page() is in the hot path, then compute segments
> during bio splitting in the submit_bio() path by applying all the
> queue limits, just as is done currently.

We can debate this, but I'm dubious about the effectiveness.  The
reason we have biovecs and don't use one bio per page is efficiency.
On large memory machines, most large IO transfers tend to be physically
contiguous because the allocators make it so.  The splitting code
splits into bios, not biovecs, so we'll likely end up with one bio per
segment.  Is that better than one page per large biovec?  Not sure;
someone will have to do careful benchmarking.  I'm not saying let's not
do this, just that it's not an obvious win.

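To make the trade-off concrete, here is a minimal userspace sketch in C of the scheme under discussion. It is not kernel code: struct vec, add_page(), count_segments() and PAGE_SZ are hypothetical names invented for illustration, and a 4 KiB page is assumed. The add path merges physically contiguous pages into one large vector without consulting any queue limit, and the segment count is only computed afterwards against a maximum segment size, roughly as the splitting step would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u /* assumed page size for this sketch */

/* Hypothetical stand-in for a bio vec entry: one physically contiguous range. */
struct vec {
	uint64_t phys; /* starting physical address */
	uint32_t len;  /* length in bytes */
};

/*
 * Add one page to the table. If the page directly follows the last
 * entry in physical memory, grow that entry instead of starting a new
 * one. No queue limit is consulted here. Returns the new entry count.
 */
static size_t add_page(struct vec *v, size_t n, size_t cap, uint64_t phys)
{
	if (n > 0 && v[n - 1].phys + v[n - 1].len == phys) {
		v[n - 1].len += PAGE_SZ;
		return n;
	}
	assert(n < cap); /* sketch only: the caller sizes the table */
	v[n].phys = phys;
	v[n].len = PAGE_SZ;
	return n + 1;
}

/*
 * Count the segments a queue with the given maximum segment size
 * would need for these entries, i.e. the limit is applied late.
 */
static size_t count_segments(const struct vec *v, size_t n, uint32_t max_seg)
{
	size_t segs = 0;

	assert(max_seg > 0);
	for (size_t i = 0; i < n; i++)
		segs += ((uint64_t)v[i].len + max_seg - 1) / max_seg; /* round up */
	return segs;
}
```

Under these assumptions, 64 contiguous pages collapse into a single 256 KiB entry, and a 64 KiB segment limit turns that entry into four segments at split time. If each segment then becomes its own bio, that is four bios for one transfer, which is the cost that would need benchmarking.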
James