On 3/2/2023 10:23 PM, David Hildenbrand wrote:
> On 02.03.23 14:32, Yin, Fengwei wrote:
>>
>>
>> On 3/2/2023 6:04 PM, David Hildenbrand wrote:
>>> On 01.03.23 02:44, Yin, Fengwei wrote:
>>>> On Tue, 2023-02-28 at 12:28 -0800, Andrew Morton wrote:
>>>>> On Tue, 28 Feb 2023 20:23:03 +0800 Yin Fengwei
>>>>> <fengwei.yin@xxxxxxxxx> wrote:
>>>>>
>>>>>> Testing done with the V2 patchset in a qemu guest
>>>>>> with 4G mem + 512M zram:
>>>>>> - kernel mm selftest to trigger vmscan() and finally hit
>>>>>>   try_to_unmap_one().
>>>>>> - Inject hwpoison to hugetlb page to trigger try_to_unmap_one()
>>>>>>   call against hugetlb.
>>>>>> - 8 hours stress testing: Firefox + kernel mm selftest + kernel
>>>>>>   build.
>>>>>
>>>>> Was any performance testing done with these changes?
>>>> I tried to collect the performance data. But found out that it's
>>>> not easy to trigger try_to_unmap_one() path (the only one I noticed
>>>> is to trigger page cache reclaim). And I am not aware of a workload
>>>> that can show it. Do you have some workloads suggested to run? Thanks.
>>>
>>> If it happens barely, why care about performance and have a
>>> "398 insertions(+), 260 deletions(-)" ?
>> I mean I can't find a workload to trigger page cache reclaim and measure
>> its performance. We can do "echo 1 > /proc/sys/vm/drop_caches" to reclaim
>> page cache. But there is no obvious indicator which shows the advantage
>> of this patchset. Maybe I could try eBPF to capture some statistics of
>> try_to_unmap_one()?
>
> If no workload/benchmark is affected (or simply corner cases where
> nobody cares about performance), I hope you understand that it's hard
> to argue why we should care about such an optimization then.
Yes. I understand this.

>
> I briefly thought that page migration could benefit, but it always
> uses try_to_migrate().
Yes. try_to_migrate() shares very similar logic with try_to_unmap_one().
The same batched operation applies to try_to_migrate() as well.
>
> So I guess we're fairly limited to vmscan (memory failure is a corner
> case).
Agree.

>
> I recall that there are some performance-sensitive swap-to-nvdimm test
> cases. As an alternative, one could eventually write a microbenchmark
> that measures MADV_PAGEOUT performance -- it should also end up
> triggering vmscan, but only if the page is mapped exactly once (in
> which case, I assume batch removal doesn't really help ?).
Yes. MADV_PAGEOUT can trigger vmscan. My understanding is that a page with
only one mapping could also benefit from the batched operation. Let me try
to write a microbenchmark based on MADV_PAGEOUT and see what we could get.
Thanks.

Regards
Yin, Fengwei
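
For illustration only, a MADV_PAGEOUT microbenchmark of the kind discussed
above might start from the sketch below. It is not the benchmark that was
eventually used in this thread. The helper names (now_sec, pages_per_sec,
time_pageout) are hypothetical, and the fallback value 21 for MADV_PAGEOUT is
taken from the asm-generic uapi header and is an assumption on other
architectures. It assumes Linux 5.4 or later and some swap device (e.g. the
zram setup mentioned earlier); without swap, madvise() can still return 0
while reclaiming nothing, so the timing alone does not prove pages were
unmapped.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21 /* assumption: asm-generic value, Linux >= 5.4 */
#endif

/* Hypothetical helper: monotonic clock reading in seconds. */
double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical helper: convert a byte count and elapsed time to pages/s. */
double pages_per_sec(size_t bytes, long page_size, double elapsed)
{
    return (double)(bytes / (size_t)page_size) / elapsed;
}

/*
 * Map `bytes` of private anonymous memory, write to every page so each
 * one is populated and mapped exactly once, then time a single
 * madvise(MADV_PAGEOUT) call over the whole range.
 * Returns elapsed seconds, or -1.0 if mmap() or madvise() fails.
 */
double time_pageout(size_t bytes)
{
    assert(bytes > 0);

    char *buf = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1.0;

    memset(buf, 0x5a, bytes); /* fault in every page */

    double start = now_sec();
    int ret = madvise(buf, bytes, MADV_PAGEOUT);
    double elapsed = now_sec() - start;

    munmap(buf, bytes);
    return ret == 0 ? elapsed : -1.0;
}
```

A caller could run time_pageout(256UL << 20) several times on kernels with
and without the patchset and compare
pages_per_sec(bytes, sysconf(_SC_PAGESIZE), elapsed) across runs. Because
each page here is mapped exactly once, this only exercises the single-mapping
case raised above.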