Johannes Weiner <hannes@xxxxxxxxxxx> writes:

> On Thu, May 27, 2021 at 04:49:53PM +0800, Huang Ying wrote:
>> With commit 09854ba94c6a ("mm: do_wp_page() simplification"), after
>> COW, the idle swap cache (neither the page nor the corresponding swap
>> entry is mapped by any process) is left at its original position in
>> the LRU list.  But it may be in the active list or at the head of the
>> inactive list, so vmscan may take more overhead or time to reclaim
>> these actually unused pages.
>>
>> To help page reclaim, in this patch, after COW, we try to move the
>> idle swap cache to the tail of the inactive LRU list.  To avoid
>> introducing much overhead to the hot COW code path, all locks are
>> acquired with trylock.
>>
>> To test the patch, we used the pmbench memory accessing benchmark
>> with a working set larger than the available memory on a 2-socket
>> Intel server with an NVMe SSD as the swap device.  Test results show
>> that the pmbench score increases by up to 21.8%, with decreased swap
>> cache size and swapin throughput.
>>
>> Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
>> Suggested-by: Matthew Wilcox <willy@xxxxxxxxxxxxx>
>> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>> Cc: Peter Xu <peterx@xxxxxxxxxx>
>> Cc: Hugh Dickins <hughd@xxxxxxxxxx>
>> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
>> Cc: Mel Gorman <mgorman@xxxxxxx>
>> Cc: Rik van Riel <riel@xxxxxxxxxxx>
>> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
>> Cc: Michal Hocko <mhocko@xxxxxxxxxx>
>> Cc: Dave Hansen <dave.hansen@xxxxxxxxx>
>> Cc: Tim Chen <tim.c.chen@xxxxxxxxx>
>>
>> V2:
>>
>> - Move trylock_page() to try_to_free_idle_swapcache() per Rik and
>>   Linus' comments.
>> - Fix PageLRU() checking.
>> - Fix THP processing.
>> - Rename the function.
>> ---
>>  include/linux/memcontrol.h | 10 ++++++++++
>>  include/linux/swap.h       |  3 +++
>>  mm/memcontrol.c            | 12 ++++++++++++
>>  mm/memory.c                |  2 ++
>>  mm/swapfile.c              | 39 ++++++++++++++++++++++++++++++++++++++
>>  5 files changed, 66 insertions(+)
>
> Sorry the discussion fizzled out on the last patch.
>
> Let me try to recap this series: in your first submission you directly
> freed the old page if we copied.  Linus was worried about overhead in
> the COW path that wouldn't pay off in a real workload.  Before getting
> numbers, it was then suggested to move the pages to the tail of the
> LRU and leave them to reclaim - which was also met with skepticism.
>
> V2 presented the LRU-moving version with pmbench numbers that indeed
> show it pays off.  However, the much simpler direct freeing produces
> even better numbers in the same benchmark.  We don't have numbers
> showing whether the LRU shuffling would fare significantly better in
> other workloads.
>
> Purely looking at the code: whether we defer or free, we need to lock
> the page, take the LRU spinlock for this page, and touch the LRU
> linkage.  If we free, we add the swapcache deletion and the page
> allocator, but it's most likely the percpu-cached fast path.  If we
> defer, reclaim needs to re-establish information about the page that
> we already had in the COW context, do another LRU operation, do the
> swapcache deletion, and go through the allocator, but on cold caches.
>
> Personally, I'm a bit skeptical that the extra code complexity and
> reclaim overhead in paging workloads will definitely pay off in
> intermittently paging loads (non-paging loads wouldn't have swap
> pages).  As far as code goes, the point of 09854ba94c6a (+17, -42)
> was simplification, and this adds more lines back in another place.
> In particular, it adds another lifetime variant to swap pages, which
> are already somewhat unwieldy.  OTOH, freeing is a two-liner reusing
> the swap unmap code:
>
> 	if (page_copied)
> 		free_swap_cache(old_page);

Yes.
This looks better than my previous version, which duplicated the code of
free_swap_cache().  Thanks for pointing this out.

Best Regards,
Huang, Ying