Re: [PATCH v2 1/2] iov_iter: optimise iov_iter_npages for bvec

On 20/11/2020 17:22, Pavel Begunkov wrote:
> On 20/11/2020 02:24, Ming Lei wrote:
>> On Fri, Nov 20, 2020 at 02:06:10AM +0000, Matthew Wilcox wrote:
>>> On Fri, Nov 20, 2020 at 01:56:22AM +0000, Pavel Begunkov wrote:
>>>> On 20/11/2020 01:49, Matthew Wilcox wrote:
>>>>> On Fri, Nov 20, 2020 at 01:39:05AM +0000, Pavel Begunkov wrote:
>>>>>> On 20/11/2020 01:20, Matthew Wilcox wrote:
>>>>>>> On Thu, Nov 19, 2020 at 11:24:38PM +0000, Pavel Begunkov wrote:
>>>>>>>> The block layer spends quite a while in iov_iter_npages(), but for the
>>>>>>>> bvec case the number of pages is already known and stored in
>>>>>>>> iter->nr_segs, so it can be returned immediately as an optimisation
>>>>>>>
>>>>>>> Er ... no, it doesn't.  nr_segs is the number of bvecs.  Each bvec can
>>>>>>> store up to 4GB of contiguous physical memory.
>>>>>>
>>>>>> Ah, really, missed min() with PAGE_SIZE in bvec_iter_len(), then it's a
>>>>>> stupid statement. Thanks!
>>>>>>
>>>>>> Are there many users of that? All these iterators are a huge burden,
>>>>>> just to count one 4KB page in bvec it takes 2% of CPU time for me.
>>>>>
>>>>> __bio_try_merge_page() will create multipage BIOs, and that's
>>>>> called from a number of places including
>>>>> bio_try_merge_hw_seg(), bio_add_page(), and __bio_iov_iter_get_pages()
>>>>
>>>> I get that there are a lot of call sites; what's more interesting is
>>>> how often it's actually triggered and whether that's performance
>>>> critical for anybody. Not that I'm going to change it, I'm just
>>>> curious, but bvec.h can be nicely optimised without it.
>>>
>>> Typically when you're allocating pages for the page cache, they'll get
>>> allocated in order and then you'll read or write them in order, so yes,
>>> it ends up triggering quite a lot.  There was once a bug in the page
>>> allocator which caused them to get allocated in reverse order and it
>>> was a noticeable performance hit (this was 15-20 years ago).
>>
>> Hugepage use cases can benefit a lot from this too.
> 
> This didn't yield any considerable boost for me: 1.5% -> 1.3% for
> 1 page reads. I'll send it anyway because there are cases that can
> benefit, e.g. the hugepage one Ming mentioned.

And yeah, it just shifts my attention for optimisation to its callers,
e.g. blkdev_direct_IO.
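
For reference, the point Matthew made above (nr_segs counts bvecs, not
pages, because one bvec may cover many contiguous pages) can be sketched
in plain userspace C. This is only an illustrative model, not the kernel
code: struct model_bvec, MODEL_PAGE_SIZE, model_seg_npages() and
model_bvec_npages() are hypothetical names, a 4KB page size is assumed,
and the iterator's current offset and remaining count are ignored.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4KB pages */

/* Hypothetical userspace stand-in for the kernel's struct bio_vec. */
struct model_bvec {
	size_t offset;	/* byte offset of the data within its first page */
	size_t len;	/* length in bytes; may span many pages */
};

/* Pages touched by one segment: round (in-page offset + len) up. */
static size_t model_seg_npages(const struct model_bvec *bv)
{
	size_t offs = bv->offset % MODEL_PAGE_SIZE;

	return (offs + bv->len + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
}

/* Sum the per-segment page counts; this is generally not nr_segs. */
static size_t model_bvec_npages(const struct model_bvec *segs, size_t nr_segs)
{
	size_t npages = 0;

	assert(segs != NULL || nr_segs == 0);
	for (size_t i = 0; i < nr_segs; i++)
		npages += model_seg_npages(&segs[i]);
	return npages;
}
```

With a single page-aligned 4KB segment the result equals nr_segs, which
is the only case where the shortcut from v1 would have been right. A
segment of eight contiguous pages, or a 4KB segment starting 512 bytes
into a page, already gives a larger count than the number of segments.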

> Ming, would you like to send the patch yourself? After all, you did
> post it first.
> 

-- 
Pavel Begunkov
