When memory reclaim fails for the maximum number of attempts and we
bail out of the reclaim loop, we forget to put the target mem_cgroup
chosen for the next reclaim back on the soft limit tree. This prevents
pages in that mem_cgroup from being reclaimed in the future even though
the mem_cgroup exceeds its soft limit.

Let's say there are two mem_cgroups and both of them exceed their soft
limit, while the first one is more active than the second. Since we add
a mem_cgroup to the soft limit tree only once every 1024 events, the
second one rarely gets a chance to be put back on the soft limit tree
even though it exceeds the limit.

As time goes on, the first mem_cgroup is kept close to its soft limit
by reclaim activity, while the memory usage of the second mem_cgroup
keeps growing above its soft limit for a long time because it is so
rarely on the tree.

This patch adds next_mz back to the soft limit tree to prevent this
scenario.

Signed-off-by: Wei Yang <richard.weiyang@xxxxxxxxx>
---
 mm/memcontrol.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 344a7e891bc5..e803ff02aae2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3493,8 +3493,13 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 			loop > MEM_CGROUP_MAX_SOFT_LIMIT_RECLAIM_LOOPS))
 			break;
 	} while (!nr_reclaimed);
-	if (next_mz)
+	if (next_mz) {
+		spin_lock_irq(&mctz->lock);
+		excess = soft_limit_excess(next_mz->memcg);
+		__mem_cgroup_insert_exceeded(next_mz, mctz, excess);
+		spin_unlock_irq(&mctz->lock);
 		css_put(&next_mz->memcg->css);
+	}
 	return nr_reclaimed;
 }
 
-- 
2.33.1
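
The scenario in the changelog can be modelled in userspace. What follows
is a minimal sketch, not kernel code. struct group, pop_largest(),
insert() and failed_reclaim_pass() are hypothetical stand-ins for the
per-node memcg state, the largest-excess lookup,
__mem_cgroup_insert_exceeded() and the loop in
mem_cgroup_soft_limit_reclaim(). The sketch assumes that every reclaim
attempt reclaims nothing, that the lookup takes the chosen node off the
tree, and that the loop limit is 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed stand-in for MEM_CGROUP_MAX_SOFT_LIMIT_RECLAIM_LOOPS. */
#define MAX_LOOPS 2

/* Hypothetical model of one memcg's entry in the soft limit tree. */
struct group {
	unsigned long excess;	/* usage above the soft limit */
	bool on_tree;		/* currently linked into the tree? */
};

/* Return the on-tree group with the largest excess and unlink it. */
struct group *pop_largest(struct group *g, size_t n)
{
	struct group *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!g[i].on_tree || g[i].excess == 0)
			continue;
		if (!best || g[i].excess > best->excess)
			best = &g[i];
	}
	if (best)
		best->on_tree = false;
	return best;
}

/* Link a group back into the tree if it is still over its limit. */
void insert(struct group *mz)
{
	assert(mz != NULL);
	if (mz->excess > 0)
		mz->on_tree = true;
}

/*
 * Model one reclaim pass in which every attempt reclaims nothing.
 * Returns how many over-limit groups are left off the tree afterwards.
 * reinsert_next selects the behaviour with (true) or without (false)
 * the re-insertion of next_mz after the loop.
 */
size_t failed_reclaim_pass(struct group *g, size_t n, bool reinsert_next)
{
	struct group *mz, *next_mz = NULL;
	size_t stranded = 0;
	int loop = 0;

	do {
		mz = next_mz ? next_mz : pop_largest(g, n);
		if (!mz)
			break;
		/* Reclaim from mz fails here: nothing is freed. */
		mz->on_tree = false;
		next_mz = pop_largest(g, n);	/* unlinked from the tree */
		insert(mz);
		loop++;
	} while (next_mz && loop <= MAX_LOOPS);

	/* Without this step next_mz stays unlinked after the bail-out. */
	if (next_mz && reinsert_next)
		insert(next_mz);

	for (size_t i = 0; i < n; i++)
		if (g[i].excess > 0 && !g[i].on_tree)
			stranded++;
	return stranded;
}
```

In this model, calling failed_reclaim_pass() on two groups with excess
300 and 200, both initially on the tree, leaves one over-limit group off
the tree when reinsert_next is false and none when it is true.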