From: Yafang Shao <laoar.shao@xxxxxxxxx> Subject: mm/vmscan.c: calculate reclaimed slab caches in all reclaim paths There are six different reclaim paths by now: - kswapd reclaim path - node reclaim path - hibernate preallocate memory reclaim path - direct reclaim path - memcg reclaim path - memcg softlimit reclaim path The slab caches reclaimed in these paths are only calculated in the above three paths. There're some drawbacks if we don't calculate the reclaimed slab caches. - The sc->nr_reclaimed isn't correct if there're some slab caches relcaimed in this path. - The slab caches may be reclaimed thoroughly if there're lots of reclaimable slab caches and few page caches. Let's take an easy example for this case. If one memcg is full of slab caches and the limit of it is 512M, in other words there're approximately 512M slab caches in this memcg. Then the limit of the memcg is reached and the memcg reclaim begins, and then in this memcg reclaim path it will continuesly reclaim the slab caches until the sc->priority drops to 0. After this reclaim stops, you will find there're few slab caches left, which is less than 20M in my test case. While after this patch applied the number is greater than 300M and the sc->priority only drops to 3. Link: http://lkml.kernel.org/r/1561112086-6169-3-git-send-email-laoar.shao@xxxxxxxxx Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx> Reviewed-by: Kirill Tkhai <ktkhai@xxxxxxxxxxxxx> Reviewed-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> Cc: Kirill Tkhai <ktkhai@xxxxxxxxxxxxx> Cc: Michal Hocko <mhocko@xxxxxxxx> Cc: Johannes Weiner <hannes@xxxxxxxxxxx> Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- mm/vmscan.c | 7 +++++++ 1 file changed, 7 insertions(+) --- a/mm/vmscan.c~mm-vmscan-calculate-reclaimed-slab-caches-in-all-reclaim-paths +++ a/mm/vmscan.c @@ -3194,11 +3194,13 @@ unsigned long try_to_free_pages(struct z if (throttle_direct_reclaim(sc.gfp_mask, zonelist, nodemask)) return 1; + current->reclaim_state = &sc.reclaim_state; trace_mm_vmscan_direct_reclaim_begin(order, sc.gfp_mask); nr_reclaimed = do_try_to_free_pages(zonelist, &sc); trace_mm_vmscan_direct_reclaim_end(nr_reclaimed); + current->reclaim_state = NULL; return nr_reclaimed; } @@ -3221,6 +3223,7 @@ unsigned long mem_cgroup_shrink_node(str }; unsigned long lru_pages; + current->reclaim_state = &sc.reclaim_state; sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) | (GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK); @@ -3238,7 +3241,9 @@ unsigned long mem_cgroup_shrink_node(str trace_mm_vmscan_memcg_softlimit_reclaim_end(sc.nr_reclaimed); + current->reclaim_state = NULL; *nr_scanned = sc.nr_scanned; + return sc.nr_reclaimed; } @@ -3265,6 +3270,7 @@ unsigned long try_to_free_mem_cgroup_pag .may_shrinkslab = 1, }; + current->reclaim_state = &sc.reclaim_state; /* * Unlike direct reclaim via alloc_pages(), memcg's reclaim doesn't * take care of from where we get pages. So the node where we start the @@ -3285,6 +3291,7 @@ unsigned long try_to_free_mem_cgroup_pag psi_memstall_leave(&pflags); trace_mm_vmscan_memcg_reclaim_end(nr_reclaimed); + current->reclaim_state = NULL; return nr_reclaimed; } _