On 27/01/2021 17:16, Christoph Hellwig wrote:
> On Tue, Jan 05, 2021 at 07:43:38PM +0000, Pavel Begunkov wrote:
>> blk_bio_segment_split() is very heavy, but the current fast path covers
>> only one-segment bios under PAGE_SIZE. Add another one by estimating an
>> upper bound on the number of sectors a bio can contain.
>>
>> One restricting factor here is queue_max_segment_size(), which is
>> compared against the full iter size to avoid digging into bvecs. By
>> default it's 64KB, so this only helps requests under 64KB, but for
>> those that fall under the conditions it's much faster.
>
> I think this works, but it is a pretty gross heuristic, which also

bio->bi_iter.bi_size <= queue_max_segment_size(q)

You mean this, right? I wouldn't say it's gross, but it is _very_ loose.

> doesn't help us with NVMe, which is the I/O fast path of choice for
> most people. What is your use/test case?

Yeah, the idea is to make it work for NVMe. I don't remember NVMe
restricting segment size or anything similar, maybe only atomicity.
Which condition do you see as problematic? I can't recall without
opening the spec.

>
>> +	/*
>> +	 * Segments are contiguous, so only their ends may be not full.
>> +	 * An upper bound for them would be to assume that each takes 1B
>> +	 * but adds a sector, and all the rest are just full sectors.
>> +	 * Note: it's ok to round the size down because all not-full
>> +	 * sectors are accounted for by the first term.
>> +	 */
>> +	max_sectors = bio_segs * 2;
>> +	max_sectors += bio->bi_iter.bi_size >> 9;
>> +
>> +	if (max_sectors < q_max_sectors) {
>
> I don't think we need the max_sectors variable here.

Even more, considering that bi_size is already sector aligned, we can
kill off all of this section and just use (bi_iter.bi_size >> 9).

-- 
Pavel Begunkov