Re: + mm-swap-workingset-make-anon-shadow-nodes-memcg-aware.patch added to mm-unstable branch

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Andrew,

I want to request to drop this patch from the mm-tree. At the moment we
have no evidence that it is solving a real problem but we know that it
will cause unexpected charging of the xarray nodes.

thanks,
Shakeel

On Tue, Aug 20, 2024 at 05:48:15PM GMT, Andrew Morton wrote:
> 
> The patch titled
>      Subject: mm/swap, workingset: make anon shadow nodes memcg aware
> has been added to the -mm mm-unstable branch.  Its filename is
>      mm-swap-workingset-make-anon-shadow-nodes-memcg-aware.patch
> 
> This patch will shortly appear at
>      https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-swap-workingset-make-anon-shadow-nodes-memcg-aware.patch
> 
> This patch will later appear in the mm-unstable branch at
>     git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> 
> Before you just go and hit "reply", please:
>    a) Consider who else should be cc'ed
>    b) Prefer to cc a suitable mailing list as well
>    c) Ideally: find the original patch on the mailing list and do a
>       reply-to-all to that, adding suitable additional cc's
> 
> *** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
> 
> The -mm tree is included into linux-next via the mm-everything
> branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> and is updated there every 2-3 working days
> 
> ------------------------------------------------------
> From: Kairui Song <kasong@xxxxxxxxxxx>
> Subject: mm/swap, workingset: make anon shadow nodes memcg aware
> Date: Tue, 20 Aug 2024 17:23:59 +0800
> 
> Currently, the workingset (shadow) nodes of the swap cache are not
> accounted to their corresponding memory cgroup, instead, they are all
> accounted to the root cgroup.  This leads to inaccurate accounting and
> ineffective reclaiming.  One cgroup could swap out a large amount of
> memory, take up a large amount of memory with shadow nodes without being
> accounted.
> 
> This issue is similar to commit 7b785645e8f1 ("mm: fix page cache
> convergence regression"), where page cache shadow nodes were incorrectly
> accounted.  That was due to the accidental dropping of the accounting flag
> during the XArray conversion in commit a28334862993 ("page cache: Finish
> XArray conversion").
> 
> However, this fix has a different cause.  Swap cache shadow nodes were
> never accounted even before the XArray conversion, since they did not
> exist until commit 3852f6768ede ("mm/swapcache: support to handle the
> shadow entries"), which was years after the XArray conversion.
> 
> It's worth noting that one anon shadow Xarray node may contain different
> entries from different cgroup, and it gets accounted at reclaim time, so
> it's arguable which cgroup it should be accounted to (as Shakeal Butt
> pointed out [1]).  File pages may suffer similar issue but less common. 
> Things like proactive memory reclaim could make thing more complex.
> 
> So this commit still can't provide a 100% accurate accounting of anon
> shadows, but it covers the cases when one memory cgroup uses significant
> amount of swap, and in most cases memory pressure in one cgroup only
> suppose to reclaim this cgroup and children.  Besides, this fix is clean
> and easy enough.
> 
> Link: https://lore.kernel.org/all/7gzevefivueqtebzvikzbucnrnpurmh3scmfuiuo2tnrs37xso@haj7gzepjur2/ [1]
> Link: https://lkml.kernel.org/r/20240820092359.97782-1-ryncsn@xxxxxxxxx
> Signed-off-by: Kairui Song <kasong@xxxxxxxxxxx>
> Cc: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
> Cc: Chris Li <chrisl@xxxxxxxxxx>
> Cc: "Huang, Ying" <ying.huang@xxxxxxxxx>
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxx>
> Cc: Muchun Song <muchun.song@xxxxxxxxx>
> Cc: Nhat Pham <nphamcs@xxxxxxxxx>
> Cc: Roman Gushchin <roman.gushchin@xxxxxxxxx>
> Cc: Shakeel Butt <shakeel.butt@xxxxxxxxx>
> Cc: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> ---
> 
>  mm/swap_state.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> --- a/mm/swap_state.c~mm-swap-workingset-make-anon-shadow-nodes-memcg-aware
> +++ a/mm/swap_state.c
> @@ -97,6 +97,7 @@ int add_to_swap_cache(struct folio *foli
>  	void *old;
>  
>  	xas_set_update(&xas, workingset_update_node);
> +	xas_set_lru(&xas, &shadow_nodes);
>  
>  	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
>  	VM_BUG_ON_FOLIO(folio_test_swapcache(folio), folio);
> @@ -718,7 +719,7 @@ int init_swap_address_space(unsigned int
>  		return -ENOMEM;
>  	for (i = 0; i < nr; i++) {
>  		space = spaces + i;
> -		xa_init_flags(&space->i_pages, XA_FLAGS_LOCK_IRQ);
> +		xa_init_flags(&space->i_pages, XA_FLAGS_LOCK_IRQ | XA_FLAGS_ACCOUNT);
>  		atomic_set(&space->i_mmap_writable, 0);
>  		space->a_ops = &swap_aops;
>  		/* swap cache doesn't use writeback related tags */
> _
> 
> Patches currently in -mm which might be from kasong@xxxxxxxxxxx are
> 
> mm-swap-clean-up-initialization-helper.patch
> mm-swap-skip-slot-cache-on-freeing-for-mthp.patch
> mm-swap-allow-cache-reclaim-to-skip-slot-cache.patch
> mm-swap-add-a-fragment-cluster-list.patch
> mm-swap-relaim-the-cached-parts-that-got-scanned.patch
> mm-swap-add-a-adaptive-full-cluster-cache-reclaim.patch
> mm-swap-workingset-make-anon-shadow-nodes-memcg-aware.patch
> 




[Index of Archives]     [Kernel Archive]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]

  Powered by Linux