In the current code, lruvec_lru_size() uses lruvec_page_state_local()
to get the lru_size. That helper is based on lruvec_stat_local.count[]
of mem_cgroup_per_node, a counter which is updated in batches: pages
are not charged to it until the pending count reaches
MEMCG_CHARGE_BATCH, currently defined as 32. As a result, small
amounts of memory are not handled as expected in some scenarios. For
example, if a memcg has only 32 pages of MADV_FREE memory, those pages
are not freed as expected when the group comes under memory pressure.

Getting lru_size based on lru_zone_size of mem_cgroup_per_node, which
is not updated in batches, makes this a bit more accurate.

Signed-off-by: Honglei Wang <honglei.wang@xxxxxxxxxx>
---
 mm/vmscan.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c77d1e3761a7..c28672460868 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -354,12 +354,13 @@ unsigned long zone_reclaimable_pages(struct zone *zone)
  */
 unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru, int zone_idx)
 {
-	unsigned long lru_size;
+	unsigned long lru_size = 0;
 	int zid;
 
-	if (!mem_cgroup_disabled())
-		lru_size = lruvec_page_state_local(lruvec, NR_LRU_BASE + lru);
-	else
+	if (!mem_cgroup_disabled()) {
+		for (zid = 0; zid < MAX_NR_ZONES; zid++)
+			lru_size += mem_cgroup_get_zone_lru_size(lruvec, lru, zid);
+	} else
 		lru_size = node_page_state(lruvec_pgdat(lruvec), NR_LRU_BASE + lru);
 
 	for (zid = zone_idx + 1; zid < MAX_NR_ZONES; zid++) {
-- 
2.17.0
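
For reference, below is a sketch of how lruvec_lru_size() would read
with this patch applied. Only the hunk above comes from the patch; the
trailing loop that subtracts the zones above zone_idx, and the final
return, are assumed to match the mainline code this is diffed against.
The per-zone counters summed via mem_cgroup_get_zone_lru_size() are the
lru_zone_size[] fields of mem_cgroup_per_node, which are updated
directly rather than through the MEMCG_CHARGE_BATCH batching described
above.

/*
 * Sketch of lruvec_lru_size() with the patch applied. The second loop
 * lies outside the hunk in the patch and is assumed unchanged from the
 * mainline code.
 */
unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru, int zone_idx)
{
	unsigned long lru_size = 0;
	int zid;

	if (!mem_cgroup_disabled()) {
		/* Sum per-zone LRU counters, which are not batched. */
		for (zid = 0; zid < MAX_NR_ZONES; zid++)
			lru_size += mem_cgroup_get_zone_lru_size(lruvec, lru, zid);
	} else
		lru_size = node_page_state(lruvec_pgdat(lruvec), NR_LRU_BASE + lru);

	/* Subtract pages sitting in zones above the requested zone_idx. */
	for (zid = zone_idx + 1; zid < MAX_NR_ZONES; zid++) {
		struct zone *zone = &lruvec_pgdat(lruvec)->node_zones[zid];
		unsigned long size;

		if (!managed_zone(zone))
			continue;

		if (!mem_cgroup_disabled())
			size = mem_cgroup_get_zone_lru_size(lruvec, lru, zid);
		else
			size = zone_page_state(zone, NR_ZONE_LRU_BASE + lru);

		lru_size -= min(size, lru_size);
	}

	return lru_size;
}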