On Tue, Feb 26, 2019 at 06:57:16PM -0700, Jens Axboe wrote:
> Speaking of this, I took a quick look at why we've now regressed a lot
> on IOPS perf with the multipage work. It looks like it's all related to
> the (much) fatter setup around iteration, which is related to this very
> topic too.
>
> Basically setup of things like bio_for_each_bvec() and indexing through
> nth_page() is MUCH slower than before.

I haven't quite figured out what the point of nth_page is.  If we
physically merge, the page structures should also be consecutive in
memory in general.  The only case where this could theoretically not
be the case is with CONFIG_DISCONTIGMEM, but in that case we should
check this once in biovec_phys_mergeable, and only for that case.

Does this patch make a difference for you on x86?

--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -53,7 +53,7 @@ struct bvec_iter_all {
 
 static inline struct page *bvec_nth_page(struct page *page, int idx)
 {
-	return idx == 0 ? page : nth_page(page, idx);
+	return page + idx;
 }
 
 /*