The quilt patch titled
     Subject: mm/swap, workingset: make anon shadow nodes memcg aware
has been removed from the -mm tree.  Its filename was
     mm-swap-workingset-make-anon-shadow-nodes-memcg-aware.patch

This patch was dropped because it was nacked

------------------------------------------------------
From: Kairui Song <kasong@xxxxxxxxxxx>
Subject: mm/swap, workingset: make anon shadow nodes memcg aware
Date: Tue, 20 Aug 2024 17:23:59 +0800

Currently, the workingset (shadow) nodes of the swap cache are not
accounted to their corresponding memory cgroup; instead, they are all
accounted to the root cgroup.  This leads to inaccurate accounting and
ineffective reclaim: one cgroup can swap out a large amount of memory and
tie up a large amount of memory in shadow nodes without ever being
charged for it.

This issue is similar to commit 7b785645e8f1 ("mm: fix page cache
convergence regression"), where page cache shadow nodes were incorrectly
accounted.  That regression was caused by the accounting flag being
accidentally dropped during the XArray conversion in commit a28334862993
("page cache: Finish XArray conversion").  This fix has a different
cause, though: swap cache shadow nodes were never accounted, even before
the XArray conversion, because they did not exist until commit
3852f6768ede ("mm/swapcache: support to handle the shadow entries"),
which landed years after that conversion.

It's worth noting that one anon shadow XArray node may contain entries
from different cgroups, and it is charged at reclaim time, so it is
arguable which cgroup it should be accounted to (as Shakeel Butt pointed
out [1]).  File pages can suffer from a similar issue, but it is less
common there.  Things like proactive memory reclaim could make this more
complex.  So this commit still cannot provide 100% accurate accounting of
anon shadow nodes, but it covers the cases where one memory cgroup uses a
significant amount of swap, and in most cases memory pressure in one
cgroup is only supposed to reclaim from that cgroup and its children.
Besides, this fix is clean and simple enough.
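To illustrate why setting XA_FLAGS_ACCOUNT (together with xas_set_lru())
is enough to get memcg-charged, shrinker-visible shadow nodes, here is a
simplified sketch of the XArray node allocation path.  The helper name is
made up for illustration, and the exact upstream code in lib/xarray.c
differs between kernel versions:

/*
 * Illustrative sketch only, not the literal upstream code.
 *
 * With XA_FLAGS_ACCOUNT set on the swap cache XArray, node allocations
 * pick up __GFP_ACCOUNT and are charged to the allocating memcg.  The
 * list_lru installed by xas_set_lru(&xas, &shadow_nodes) is passed to
 * the slab allocator so the node can be tracked on the memcg-aware
 * shadow_nodes list_lru and later found by the workingset shrinker.
 */
static struct xa_node *xas_alloc_node_sketch(struct xa_state *xas, gfp_t gfp)
{
	if (xas->xa->xa_flags & XA_FLAGS_ACCOUNT)
		gfp |= __GFP_ACCOUNT;	/* charge the node to the current memcg */

	return kmem_cache_alloc_lru(radix_tree_node_cachep, xas->xa_lru, gfp);
}
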
Link: https://lore.kernel.org/all/7gzevefivueqtebzvikzbucnrnpurmh3scmfuiuo2tnrs37xso@haj7gzepjur2/ [1]
Link: https://lkml.kernel.org/r/20240820092359.97782-1-ryncsn@xxxxxxxxx
Signed-off-by: Kairui Song <kasong@xxxxxxxxxxx>
Cc: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
Cc: Chris Li <chrisl@xxxxxxxxxx>
Cc: "Huang, Ying" <ying.huang@xxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Muchun Song <muchun.song@xxxxxxxxx>
Cc: Nhat Pham <nphamcs@xxxxxxxxx>
Cc: Roman Gushchin <roman.gushchin@xxxxxxxxx>
Cc: Shakeel Butt <shakeel.butt@xxxxxxxxx>
Cc: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/swap_state.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/mm/swap_state.c~mm-swap-workingset-make-anon-shadow-nodes-memcg-aware
+++ a/mm/swap_state.c
@@ -97,6 +97,7 @@ int add_to_swap_cache(struct folio *foli
 	void *old;
 
 	xas_set_update(&xas, workingset_update_node);
+	xas_set_lru(&xas, &shadow_nodes);
 
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 	VM_BUG_ON_FOLIO(folio_test_swapcache(folio), folio);
@@ -718,7 +719,7 @@ int init_swap_address_space(unsigned int
 		return -ENOMEM;
 	for (i = 0; i < nr; i++) {
 		space = spaces + i;
-		xa_init_flags(&space->i_pages, XA_FLAGS_LOCK_IRQ);
+		xa_init_flags(&space->i_pages, XA_FLAGS_LOCK_IRQ | XA_FLAGS_ACCOUNT);
 		atomic_set(&space->i_mmap_writable, 0);
 		space->a_ops = &swap_aops;
 		/* swap cache doesn't use writeback related tags */
_

Patches currently in -mm which might be from kasong@xxxxxxxxxxx are

mm-swap-clean-up-initialization-helper.patch
mm-swap-skip-slot-cache-on-freeing-for-mthp.patch
mm-swap-allow-cache-reclaim-to-skip-slot-cache.patch
mm-swap-add-a-fragment-cluster-list.patch
mm-swap-relaim-the-cached-parts-that-got-scanned.patch
mm-swap-add-a-adaptive-full-cluster-cache-reclaim.patch