The quilt patch titled
     Subject: mm, swap: avoid reclaiming irrelevant swap cache
has been removed from the -mm tree.  Its filename was
     mm-swap-avoid-reclaiming-irrelevant-swap-cache.patch

This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Kairui Song <kasong@xxxxxxxxxxx>
Subject: mm, swap: avoid reclaiming irrelevant swap cache
Date: Fri, 14 Mar 2025 00:59:29 +0800

Patch series "mm, swap: remove swap slot cache", v3.

The slot cache was initially introduced by commit 67afa38e012e ("mm/swap:
add cache for swap slots allocation") to reduce lock contention on
si->lock.  The previous series "mm, swap: rework of swap allocator locks"
[1] removed the swap slot cache from the freeing path, as the freeing
path no longer touches si->lock in most cases.

Since that series the allocation path also has little to no contention on
si->lock, but the slot cache still helps reduce other overheads, such as
the counters and the plist.  This series removes the slot cache from the
allocation path too, by using the cluster as the allocation fast path,
and also reduces those other overheads.

Now that the slot cache is completely gone, the code is much simplified,
with no obvious feature or performance change, and the related
workarounds are cleaned up.  This should also avoid other potential
issues, e.g. the long pinning of swap slots: the swap slot cache pins
swap slots with HAS_CACHE, causing reclaim or allocation to fail to use
these slots when scanning.

The only behavior change is the swap device allocation rotation
mechanism, as explained in the patch "mm, swap: use percpu cluster as
allocation fast path".

Test results look good after deleting the swap slot cache:

- vm-scalability with: `usemem --init-time -O -y -x -R -31 1G`,
  12G memory cgroup using simulated pmem as SWAP (32G pmem, 32 CPUs),
  16 test runs for each case, measuring the total throughput:

                        Before (KB/s) (stdev)    After (KB/s) (stdev)
  Random (4K):          424907.60 (24410.78)     414745.92 (34554.78)
  Random (64K):         163308.82 (11635.72)     167314.50 (18434.99)
  Sequential (4K, !-R): 6150056.79 (103205.90)   6321469.06 (115878.16)

- Build linux kernel with make -j96, using 4K folio with 1.5G memory
  cgroup limit and 64K folio with 2G memory cgroup limit, on top of
  tmpfs, 12 test runs, measuring the system time:

                    Before (s) (stdev)    After (s) (stdev)
  make -j96 (4K):   6445.69 (61.95)       6408.80 (69.46)
  make -j96 (64K):  6841.71 (409.04)      6437.99 (435.55)

The performance is unchanged, and slightly better in some cases.

[1] https://lore.kernel.org/linux-mm/20250113175732.48099-1-ryncsn@xxxxxxxxx/


This patch (of 7):

The swap allocator performs swap cache reclaim to recycle HAS_CACHE
slots for allocation.  It initiates the reclaim from the offset to be
reclaimed and looks up the corresponding folio.  The lookup process is
lockless, so it's possible that the folio is removed from the swap cache
and given a different swap entry before the reclaim locks the folio.  If
that happens, the reclaim ends up reclaiming an irrelevant folio and
returning a wrong value.

This shouldn't cause any problem with correctness or stability, but it is
confusing and unexpected, and it increases fragmentation and decreases
performance.  Fix this by checking whether the folio still points to the
offset the allocator wants to reclaim before reclaiming it.
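For illustration only, here is a minimal userspace C sketch of the
revalidation check added by the diff below; struct fake_folio,
folio_still_covers() and the sample offsets are made-up stand-ins for the
kernel's folio and swp_offset(), not part of this patch:

	/*
	 * Sketch of the range check: after the lockless lookup, the folio's
	 * swap entry may have changed, so verify that the offset we intended
	 * to reclaim still falls inside the folio's swap range
	 * [swp_offset(entry), swp_offset(entry) + nr_pages) before going on.
	 */
	#include <stdbool.h>
	#include <stdio.h>

	/* Model of the folio's current swap entry: first offset and page count. */
	struct fake_folio {
		unsigned long swap_offset;	/* stands in for swp_offset(folio->swap) */
		int nr_pages;			/* stands in for folio_nr_pages(folio) */
	};

	/*
	 * Return true if 'offset' is still covered by the folio; false means
	 * the folio was given a different entry and the lookup must be retried.
	 */
	static bool folio_still_covers(const struct fake_folio *folio,
				       unsigned long offset)
	{
		return offset >= folio->swap_offset &&
		       offset < folio->swap_offset + folio->nr_pages;
	}

	int main(void)
	{
		struct fake_folio folio = { .swap_offset = 228, .nr_pages = 16 };

		/* Offset in the middle of the large folio: still relevant. */
		printf("offset 230 covered: %d\n", folio_still_covers(&folio, 230));

		/* Folio was reassigned in the meantime: retry the lookup. */
		printf("offset 100 covered: %d\n", folio_still_covers(&folio, 100));
		return 0;
	}

In the actual patch, a failed check retries the lookup with "goto again"
rather than bailing out, since the slot may now be backed by a different
folio that is still worth reclaiming.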
Link: https://lkml.kernel.org/r/20250313165935.63303-1-ryncsn@xxxxxxxxx
Link: https://lkml.kernel.org/r/20250313165935.63303-2-ryncsn@xxxxxxxxx
Signed-off-by: Kairui Song <kasong@xxxxxxxxxxx>
Reviewed-by: Baoquan He <bhe@xxxxxxxxxx>
Cc: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
Cc: Barry Song <v-songbaohua@xxxxxxxx>
Cc: Chris Li <chrisl@xxxxxxxxxx>
Cc: "Huang, Ying" <ying.huang@xxxxxxxxxxxxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Kairui Song <kasong@xxxxxxxxxxx>
Cc: Kalesh Singh <kaleshsingh@xxxxxxxxxx>
Cc: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
Cc: Nhat Pham <nphamcs@xxxxxxxxx>
Cc: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/swapfile.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/mm/swapfile.c~mm-swap-avoid-reclaiming-irrelevant-swap-cache
+++ a/mm/swapfile.c
@@ -210,6 +210,7 @@ static int __try_to_reclaim_swap(struct
 	int ret, nr_pages;
 	bool need_reclaim;
 
+again:
 	folio = filemap_get_folio(address_space, swap_cache_index(entry));
 	if (IS_ERR(folio))
 		return 0;
@@ -227,8 +228,16 @@ static int __try_to_reclaim_swap(struct
 	if (!folio_trylock(folio))
 		goto out;
 
-	/* offset could point to the middle of a large folio */
+	/*
+	 * Offset could point to the middle of a large folio, or folio
+	 * may no longer point to the expected offset before it's locked.
+	 */
 	entry = folio->swap;
+	if (offset < swp_offset(entry) || offset >= swp_offset(entry) + nr_pages) {
+		folio_unlock(folio);
+		folio_put(folio);
+		goto again;
+	}
 	offset = swp_offset(entry);
 
 	need_reclaim = ((flags & TTRS_ANYWAY) ||
_

Patches currently in -mm which might be from kasong@xxxxxxxxxxx are