On 20/11/2020 17:22, Pavel Begunkov wrote:
> On 20/11/2020 02:24, Ming Lei wrote:
>> On Fri, Nov 20, 2020 at 02:06:10AM +0000, Matthew Wilcox wrote:
>>> On Fri, Nov 20, 2020 at 01:56:22AM +0000, Pavel Begunkov wrote:
>>>> On 20/11/2020 01:49, Matthew Wilcox wrote:
>>>>> On Fri, Nov 20, 2020 at 01:39:05AM +0000, Pavel Begunkov wrote:
>>>>>> On 20/11/2020 01:20, Matthew Wilcox wrote:
>>>>>>> On Thu, Nov 19, 2020 at 11:24:38PM +0000, Pavel Begunkov wrote:
>>>>>>>> The block layer spends quite a while in iov_iter_npages(), but for the
>>>>>>>> bvec case the number of pages is already known and stored in
>>>>>>>> iter->nr_segs, so it can be returned immediately as an optimisation.
>>>>>>>
>>>>>>> Er ... no, it doesn't. nr_segs is the number of bvecs. Each bvec can
>>>>>>> store up to 4GB of contiguous physical memory.
>>>>>>
>>>>>> Ah, really, I missed the min() with PAGE_SIZE in bvec_iter_len(), so that
>>>>>> was a silly statement. Thanks!
>>>>>>
>>>>>> Are there many users of that? All these iterators are a huge burden;
>>>>>> just counting one 4KB page in a bvec takes 2% of CPU time for me.
>>>>>
>>>>> __bio_try_merge_page() will create multipage BIOs, and that's
>>>>> called from a number of places including
>>>>> bio_try_merge_hw_seg(), bio_add_page(), and __bio_iov_iter_get_pages().
>>>>
>>>> I get that there are a lot of places; more interesting is how often
>>>> it's actually triggered and whether it's performance critical for anybody.
>>>> Not that I'm going to change it, just out of curiosity, but bvec.h
>>>> can be nicely optimised without it.
>>>
>>> Typically when you're allocating pages for the page cache, they'll get
>>> allocated in order and then you'll read or write them in order, so yes,
>>> it ends up triggering quite a lot. There was once a bug in the page
>>> allocator which caused them to get allocated in reverse order and it
>>> was a noticeable performance hit (this was 15-20 years ago).
>>
>> hugepage use cases can benefit much from this too.
>
> This didn't yield any considerable boost for me though: 1.5% -> 1.3%
> for 1-page reads. I'll send it anyway because there are cases that
> can benefit, e.g. as Ming mentioned.

And yeah, it just shifts my attention for optimisation to its callers,
e.g. blkdev_direct_IO.

> Ming, would you want to send the patch yourself? After all, you did
> post it first.

-- 
Pavel Begunkov
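
To make the point about multipage bvecs concrete, here is a minimal
stand-alone sketch in plain userspace C. The struct, PAGE_SIZE constant
and bvec_pages() helper below are simplified stand-ins, not the kernel's
own types or helpers; the sketch only illustrates why iter->nr_segs
alone undercounts: one bvec segment describes a physically contiguous
range that may span many pages.

/*
 * Illustrative only, not the kernel's implementation: one "segment"
 * (bvec) can cover many pages, so segment count != page count.
 */
#include <stdio.h>

#define PAGE_SIZE 4096UL

struct bvec {
        unsigned long offset;   /* offset into the first page */
        unsigned long len;      /* length of the contiguous range */
};

/* Pages touched by one bvec: round the [offset, offset + len) span up. */
static unsigned long bvec_pages(const struct bvec *bv)
{
        return (bv->offset % PAGE_SIZE + bv->len + PAGE_SIZE - 1) / PAGE_SIZE;
}

int main(void)
{
        /* One segment, but 1 MiB of contiguous memory: 256 pages, not 1. */
        struct bvec bv = { .offset = 0, .len = 1024 * 1024 };

        printf("segments: 1, pages: %lu\n", bvec_pages(&bv));
        return 0;
}

Running it prints "segments: 1, pages: 256", which is exactly the
mismatch the quoted commit message overlooked.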