The patch titled
     Subject: mm/memcg: add next_mz back to soft limit tree if not reclaimed yet
has been added to the -mm tree.  Its filename is
     mm-memcg-add-next_mz-back-to-soft-limit-tree-if-not-reclaimed-yet.patch

This patch should soon appear at
    https://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-add-next_mz-back-to-soft-limit-tree-if-not-reclaimed-yet.patch
and later at
    https://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-add-next_mz-back-to-soft-limit-tree-if-not-reclaimed-yet.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Wei Yang <richard.weiyang@xxxxxxxxx>
Subject: mm/memcg: add next_mz back to soft limit tree if not reclaimed yet

When memory reclaim fails for the maximum number of attempts and we bail
out of the reclaim loop, we forget to put the target mem_cgroup chosen for
the next reclaim back on the soft limit tree.  This prevents pages in that
mem_cgroup from being reclaimed in the future even though the mem_cgroup
exceeds its soft limit.

Let's say there are two mem_cgroups and both of them exceed the soft
limit, while the first one is more active than the second.  Since we add a
mem_cgroup to the soft limit tree only once every 1024 events, the second
one rarely gets a chance to be put back on the soft limit tree even though
it exceeds the limit.

As time goes on, the first mem_cgroup is kept close to its soft limit by
reclaim activity, while the memory usage of the second mem_cgroup keeps
growing over the soft limit for a long time because it is so rarely on
the tree.

This patch adds next_mz back to the soft limit tree to prevent this
scenario.
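To illustrate the scenario described above, here is a minimal userspace sketch, not kernel code. It assumes a toy model in which the soft limit tree is just an on_tree flag per group, reclaim never makes progress, and picking the largest entry also removes it from the tree. All toy_* names and TOY_MAX_LOOPS are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_LOOPS 2	/* small retry bound chosen for this toy model */

/* Toy stand-in for a per-node memcg entry; not the kernel's structure. */
struct toy_mz {
	long usage;
	long soft_limit;
	bool on_tree;
};

static long toy_excess(const struct toy_mz *mz)
{
	return mz->usage > mz->soft_limit ? mz->usage - mz->soft_limit : 0;
}

/* Pick the on-tree entry with the largest excess and take it off the tree. */
static struct toy_mz *toy_pop_largest(struct toy_mz *g, int n)
{
	struct toy_mz *best = NULL;

	for (int i = 0; i < n; i++)
		if (g[i].on_tree &&
		    (!best || toy_excess(&g[i]) > toy_excess(best)))
			best = &g[i];
	if (best)
		best->on_tree = false;
	return best;
}

/* Put an entry on the tree only if it still exceeds its soft limit. */
static void toy_insert(struct toy_mz *mz)
{
	if (toy_excess(mz) > 0)
		mz->on_tree = true;
}

/* One reclaim pass in which every reclaim attempt makes no progress. */
static void toy_reclaim_pass(struct toy_mz *g, int n, bool reinsert_next)
{
	struct toy_mz *mz, *next_mz = NULL;
	long nr_reclaimed = 0;
	int loop = 0;

	do {
		mz = next_mz ? next_mz : toy_pop_largest(g, n);
		if (!mz)
			break;

		long reclaimed = 0;	/* assumption: reclaim always fails */
		nr_reclaimed += reclaimed;

		next_mz = NULL;
		if (!reclaimed)
			next_mz = toy_pop_largest(g, n);
		toy_insert(mz);		/* the current target goes back */
		loop++;

		if (!nr_reclaimed && (!next_mz || loop > TOY_MAX_LOOPS))
			break;
	} while (!nr_reclaimed);

	/* On bail-out, the candidate for the next round is off the tree. */
	assert(!next_mz || !next_mz->on_tree);
	if (next_mz && reinsert_next)
		toy_insert(next_mz);	/* the step this patch adds */
}

/* Count how many of two over-limit groups remain on the tree afterwards. */
static int toy_on_tree_after_bailout(bool reinsert_next)
{
	struct toy_mz g[2] = {
		{ .usage = 300, .soft_limit = 100, .on_tree = true },
		{ .usage = 200, .soft_limit = 100, .on_tree = true },
	};

	toy_reclaim_pass(g, 2, reinsert_next);
	return (int)g[0].on_tree + (int)g[1].on_tree;
}
```

In this model, toy_on_tree_after_bailout(false) leaves only one of the two over-limit groups on the tree, while toy_on_tree_after_bailout(true) keeps both. The real code additionally holds mctz->lock around the insertion and drops the css reference afterwards, which the sketch leaves out.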
Link: https://lkml.kernel.org/r/20220312071623.19050-3-richard.weiyang@xxxxxxxxx
Signed-off-by: Wei Yang <richard.weiyang@xxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Roman Gushchin <roman.gushchin@xxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Shakeel Butt <shakeelb@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/memcontrol.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/mm/memcontrol.c~mm-memcg-add-next_mz-back-to-soft-limit-tree-if-not-reclaimed-yet
+++ a/mm/memcontrol.c
@@ -3453,8 +3453,13 @@ unsigned long mem_cgroup_soft_limit_recl
 			loop > MEM_CGROUP_MAX_SOFT_LIMIT_RECLAIM_LOOPS))
 			break;
 	} while (!nr_reclaimed);
-	if (next_mz)
+	if (next_mz) {
+		spin_lock_irq(&mctz->lock);
+		excess = soft_limit_excess(next_mz->memcg);
+		__mem_cgroup_insert_exceeded(next_mz, mctz, excess);
+		spin_unlock_irq(&mctz->lock);
 		css_put(&next_mz->memcg->css);
+	}
 	return nr_reclaimed;
 }
 
_

Patches currently in -mm which might be from richard.weiyang@xxxxxxxxx are

mm-memcg-mem_cgroup_per_node-is-already-set-to-0-on-allocation.patch
mm-memcg-retrieve-parent-memcg-from-cssparent.patch
mm-memcg-set-memcg-after-css-verified-and-got-reference.patch
mm-memcg-set-pos-to-prev-unconditionally.patch
mm-memcg-move-generation-assignment-and-comparison-together.patch
mm-memcg-mz-already-removed-from-rb_tree-in-mem_cgroup_largest_soft_limit_node.patch
mm-memcg-__mem_cgroup_remove_exceeded-could-handle-a-on-tree-mz-properly.patch
mm-memcg-add-next_mz-back-to-soft-limit-tree-if-not-reclaimed-yet.patch
mm-page_alloc-add-same-penalty-is-enough-to-get-round-robin-order.patch
mm-page_alloc-add-penalty-to-local_node.patch
memcg-do-not-tweak-node-in-alloc_mem_cgroup_per_node_info.patch