On Wed, Apr 20, 2011 at 8:48 PM, KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
memcg-kswapd visits each memcg in round-robin order. But the amount
of work required depends on each memcg's usage and its high/low
watermarks, and it would be good to take that into account.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
---
include/linux/memcontrol.h | 1 +
mm/memcontrol.c | 17 +++++++++++++++++
mm/vmscan.c | 2 ++
3 files changed, 20 insertions(+)
Index: mmotm-Apr14/include/linux/memcontrol.h
===================================================================
--- mmotm-Apr14.orig/include/linux/memcontrol.h
+++ mmotm-Apr14/include/linux/memcontrol.h
@@ -98,6 +98,7 @@ extern bool mem_cgroup_kswapd_can_sleep(
extern struct mem_cgroup *mem_cgroup_get_shrink_target(void);
extern void mem_cgroup_put_shrink_target(struct mem_cgroup *mem);
extern wait_queue_head_t *mem_cgroup_kswapd_waitq(void);
+extern int mem_cgroup_kswapd_bonus(struct mem_cgroup *mem);
static inline
int mm_match_cgroup(const struct mm_struct *mm, const struct mem_cgroup *cgroup)
Index: mmotm-Apr14/mm/memcontrol.c
===================================================================
--- mmotm-Apr14.orig/mm/memcontrol.c
+++ mmotm-Apr14/mm/memcontrol.c
@@ -4673,6 +4673,23 @@ struct memcg_kswapd_work
struct memcg_kswapd_work memcg_kswapd_control;
+int mem_cgroup_kswapd_bonus(struct mem_cgroup *mem)
+{
+ unsigned long long usage, lowat, hiwat;
+ int rate;
+
+ usage = res_counter_read_u64(&mem->res, RES_USAGE);
+ lowat = res_counter_read_u64(&mem->res, RES_LOW_WMARK_LIMIT);
+ hiwat = res_counter_read_u64(&mem->res, RES_HIGH_WMARK_LIMIT);
+ if (lowat == hiwat)
+ return 0;
+
+ rate = (usage - hiwat) * 10 / (lowat - hiwat);
+ /* If usage is big, we reclaim more */
+ return rate * SWAP_CLUSTER_MAX;
+}
+
I understand the general logic: we would like to reclaim more on each pass if more work needs to be done. But I am not quite sure about the calculation here. (usage - hiwat) determines the amount of work for kswapd, so why divide by (lowat - hiwat)? My guess is that it is because the larger that value is, the later we trigger kswapd?
--Ying
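
To make the question concrete, here is a minimal user-space sketch of the arithmetic in mem_cgroup_kswapd_bonus(). It is not the patch itself. The name kswapd_bonus_model() is hypothetical, SWAP_CLUSTER_MAX is assumed to be 32, and lowat is assumed to sit above hiwat in usage terms. The usage <= hiwat guard is an addition of this sketch to avoid unsigned wraparound; the patch has no such check. One possible reading of the division is that it normalizes the excess over hiwat to the width of the watermark gap, so rate runs from 0 to 10 as usage moves from hiwat to lowat.

```c
/* Assumption: SWAP_CLUSTER_MAX is 32, as in kernels of this era. */
#define SWAP_CLUSTER_MAX 32

/*
 * User-space model of mem_cgroup_kswapd_bonus() from the patch.
 * The usage <= hiwat check is an addition of this sketch, not part of
 * the patch; it keeps the unsigned subtraction from wrapping around.
 */
static int kswapd_bonus_model(unsigned long long usage,
			      unsigned long long lowat,
			      unsigned long long hiwat)
{
	int rate;

	if (lowat == hiwat || usage <= hiwat)
		return 0;

	/* Excess over hiwat, in tenths of the (lowat - hiwat) gap. */
	rate = (int)((usage - hiwat) * 10 / (lowat - hiwat));

	/* Same scaling as the patch: more excess, larger bonus. */
	return rate * SWAP_CLUSTER_MAX;
}
```

With hiwat = 100 and lowat = 120 (in any unit), usage = 110 gives rate = 5 and a bonus of 160 pages, and usage = 120 gives rate = 10 and a bonus of 320 pages. Under this reading, the same absolute excess earns a smaller bonus when the gap between the watermarks is wider.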
static void wake_memcg_kswapd(struct mem_cgroup *mem)
{
if (atomic_read(&mem->kswapd_running)) /* already running */
Index: mmotm-Apr14/mm/vmscan.c
===================================================================
--- mmotm-Apr14.orig/mm/vmscan.c
+++ mmotm-Apr14/mm/vmscan.c
@@ -2732,6 +2732,8 @@ static int shrink_mem_cgroup(struct mem_
sc.nr_reclaimed = 0;
total_scanned = 0;
+ sc.nr_to_reclaim += mem_cgroup_kswapd_bonus(mem_cont);
+
do_nodes = node_states[N_ONLINE];
for (priority = DEF_PRIORITY;