Re: [PATCH v2 0/5] batched remove rmap in try_to_unmap_one()

On 3/2/2023 10:23 PM, David Hildenbrand wrote:
> On 02.03.23 14:32, Yin, Fengwei wrote:
>>
>>
>> On 3/2/2023 6:04 PM, David Hildenbrand wrote:
>>> On 01.03.23 02:44, Yin, Fengwei wrote:
>>>> On Tue, 2023-02-28 at 12:28 -0800, Andrew Morton wrote:
>>>>> On Tue, 28 Feb 2023 20:23:03 +0800 Yin Fengwei
>>>>> <fengwei.yin@xxxxxxxxx> wrote:
>>>>>
>>>>>> Testing done with the V2 patchset in a qemu guest
>>>>>> with 4G mem + 512M zram:
>>>>>>     - kernel mm selftest to trigger vmscan() and finally hit
>>>>>>       try_to_unmap_one().
>>>>>>     - Inject hwpoison to hugetlb page to trigger try_to_unmap_one()
>>>>>>       call against hugetlb.
>>>>>>     - 8 hours stress testing: Firefox + kernel mm selftest + kernel
>>>>>>       build.
>>>>>
>>>>> Was any performance testing done with these changes?
>>>> I tried to collect performance data, but found that it's not easy
>>>> to trigger the try_to_unmap_one() path (the only way I noticed is to
>>>> trigger page cache reclaim). And I am not aware of a workload that
>>>> can show it. Do you have any workloads to suggest? Thanks.
>>>
>>> If it barely happens, why care about performance and have a "398 insertions(+), 260 deletions(-)"?
>> I mean that I can't find a workload that triggers page cache reclaim and
>> lets me measure its performance. We can do "echo 1 > /proc/sys/vm/drop_caches"
>> to reclaim page cache, but there is no obvious indicator that shows the
>> advantage of this patchset. Maybe I could try eBPF to capture some statistics
>> of try_to_unmap_one()?
> 
> If no workload/benchmark is affected (or simply corner cases where nobody cares about performance), I hope you understand that it's hard to argue why we should care about such an optimization then.
Yes, I understand.

> 
> I briefly thought that page migration could benefit, but it always uses try_to_migrate().
Yes. try_to_migrate() shares very similar logic with try_to_unmap_one(), so the
same batched operation can be applied to try_to_migrate() as well.

> 
> So I guess we're fairly limited to vmscan (memory failure is a corner case).
Agreed.

> 
> I recall that there are some performance-sensitive swap-to-nvdimm test cases. As an alternative, one could eventually write a microbenchmark that measures MADV_PAGEOUT performance -- it should also end up triggering vmscan, but only if the page is mapped exactly once (in which case, I assume batch removal doesn't really help ?).
Yes. MADV_PAGEOUT can trigger vmscan. My understanding is that a page mapped
only once could also benefit from the batched operation. Let me try to write
a microbenchmark based on MADV_PAGEOUT and see what we get. Thanks.
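
A rough sketch of what such a microbenchmark could look like is below. It is
only an illustration, not the test I will actually post: the helper names
pageout_bench_ns() and ts_diff_ns() are made up here, and the fallback value
for MADV_PAGEOUT assumes the Linux UAPI definition (available since 5.4). It
touches every page of a private anonymous mapping so that each page is mapped
exactly once, then times a single madvise(MADV_PAGEOUT) call over the range.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21 /* assumption: value from the Linux UAPI headers */
#endif

/* Nanoseconds elapsed from sample a to sample b. */
int64_t ts_diff_ns(struct timespec a, struct timespec b)
{
    return (int64_t)(b.tv_sec - a.tv_sec) * 1000000000LL
           + (int64_t)(b.tv_nsec - a.tv_nsec);
}

/*
 * Hypothetical helper: map len bytes of private anonymous memory, fault in
 * every page, then time madvise(MADV_PAGEOUT) over the whole range.
 * Returns the elapsed time in ns, or -1 with errno set on failure.
 */
int64_t pageout_bench_ns(size_t len)
{
    struct timespec t0, t1;
    long psz = sysconf(_SC_PAGESIZE);
    unsigned char *p;
    int ret, saved;

    assert(psz > 0);

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Write to each page so it is populated and mapped exactly once. */
    for (size_t off = 0; off < len; off += (size_t)psz)
        p[off] = 1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    ret = madvise(p, len, MADV_PAGEOUT);
    saved = errno; /* keep madvise()'s errno across the calls below */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    munmap(p, len);
    if (ret != 0) {
        errno = saved;
        return -1;
    }
    return ts_diff_ns(t0, t1);
}
```

Calling pageout_bench_ns(64UL << 20) from main() and dividing the result by
the number of pages would give a per-page cost to compare with and without the
patchset. The 64M size is arbitrary. Note that without swap configured (zram
in my setup), the anonymous pages may not actually be reclaimed, in which case
the number would not reflect the try_to_unmap_one() path.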


Regards
Yin, Fengwei

> 



