On Mon, Apr 25, 2011 at 2:34 AM, KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
During memcg memory reclaim, get_scan_count() may return [0, 0, 0, 0],
so no scan is issued at that reclaim priority.
This happens because the memory cgroup may be too small to hold
more than 1 << priority pages.
Because priority affects many routines in vmscan.c, it's better
to scan memory even if usage >> priority == 0.
From another point of view, if the memcg's zone doesn't have enough memory
to meet the priority, it should be skipped. So this patch creates a temporary priority
in get_scan_count() and scans some pages even when
usage is small. With this, memcg reclaim goes more smoothly without
having too high a priority, which would cause unnecessary congestion_wait(), etc.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
---
include/linux/memcontrol.h | 6 ++++++
mm/memcontrol.c | 5 +++++
mm/vmscan.c | 11 +++++++++++
3 files changed, 22 insertions(+)
Index: memcg/include/linux/memcontrol.h
===================================================================
--- memcg.orig/include/linux/memcontrol.h
+++ memcg/include/linux/memcontrol.h
@@ -152,6 +152,7 @@ unsigned long mem_cgroup_soft_limit_recl
gfp_t gfp_mask,
unsigned long *total_scanned);
u64 mem_cgroup_get_limit(struct mem_cgroup *mem);
+u64 mem_cgroup_get_usage(struct mem_cgroup *mem);
void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -357,6 +358,11 @@ u64 mem_cgroup_get_limit(struct mem_cgro
return 0;
}
+static inline u64 mem_cgroup_get_limit(struct mem_cgroup *mem)
+{
+ return 0;
+}
+
This stub should be mem_cgroup_get_usage(), not mem_cgroup_get_limit().
static inline void mem_cgroup_split_huge_fixup(struct page *head,
struct page *tail)
{
Index: memcg/mm/memcontrol.c
===================================================================
--- memcg.orig/mm/memcontrol.c
+++ memcg/mm/memcontrol.c
@@ -1483,6 +1483,11 @@ u64 mem_cgroup_get_limit(struct mem_cgro
return min(limit, memsw);
}
+u64 mem_cgroup_get_usage(struct mem_cgroup *memcg)
+{
+ return res_counter_read_u64(&memcg->res, RES_USAGE);
+}
+
/*
* Visit the first child (need not be the first child as per the ordering
* of the cgroup list, since we track last_scanned_child) of @mem and use
Index: memcg/mm/vmscan.c
===================================================================
--- memcg.orig/mm/vmscan.c
+++ memcg/mm/vmscan.c
@@ -1762,6 +1762,17 @@ static void get_scan_count(struct zone *
denominator = 1;
goto out;
}
+ } else {
+ u64 usage;
+ /*
+ * When the memcg is small enough, anon+file >> priority
+ * can be 0 and we'll do no scan. Adjust priority to a proper
+ * value for its usage. If this zone's usage is small
+ * enough, scan will ignore this zone until priority goes down.
+ */
+ for (usage = mem_cgroup_get_usage(sc->mem_cgroup) >> PAGE_SHIFT;
+ priority && ((usage >> priority) < SWAP_CLUSTER_MAX);
+ priority--);
}
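For reference, the effect of the loop above can be checked outside the
kernel. The following is only a user-space sketch, not part of the patch:
memcg_adjust_priority() and scan_target() are hypothetical helper names,
and SWAP_CLUSTER_MAX = 32 and DEF_PRIORITY = 12 are assumed to match the
usual kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; in the kernel these come from swap.h and vmscan.c. */
#define SWAP_CLUSTER_MAX 32ULL
#define DEF_PRIORITY 12

/* Hypothetical helper: pages a scan would target, i.e. pages >> priority. */
static uint64_t scan_target(uint64_t pages, int priority)
{
	assert(priority >= 0 && priority < 64);
	return pages >> priority;
}

/*
 * Hypothetical helper mirroring the for loop in the hunk above:
 * lower priority until usage >> priority reaches SWAP_CLUSTER_MAX,
 * or until priority hits 0.
 */
static int memcg_adjust_priority(uint64_t usage_pages, int priority)
{
	assert(priority >= 0 && priority < 64);
	while (priority && scan_target(usage_pages, priority) < SWAP_CLUSTER_MAX)
		priority--;
	return priority;
}
```

Under these assumptions, a memcg using 100 pages would target
100 >> 12 = 0 pages at DEF_PRIORITY, and the loop lowers priority to 1,
where 100 >> 1 = 50 pages. A memcg using 10 pages never reaches
SWAP_CLUSTER_MAX and ends at priority 0.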
--Ying
/*