Please find my comments below.

> More comments on the code below.
>
> [...]
>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 53b8201..d8e6ee6 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -1743,6 +1743,53 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
>>                               NULL, "Memory cgroup out of memory");
>>  }
>>
>> +/*
>> + * If a cgroup is under low limit or enough close to it,
>> + * decrease speed of page scanning.
>> + *
>> + * mem_cgroup_low_limit_scale() returns a number
>> + * from range [0, DEF_PRIORITY - 2], which is used
>> + * in the reclaim code as a scanning priority modifier.
>> + *
>> + * If the low limit is not set, it returns 0;
>> + *
>> + * usage - low_limit > usage / 8  => 0
>> + * usage - low_limit > usage / 16 => 1
>> + * usage - low_limit > usage / 32 => 2
>> + * ...
>> + * usage - low_limit > usage / (2 ^ DEF_PRIORITY - 3) => DEF_PRIORITY - 3
>> + * usage < low_limit => DEF_PRIORITY - 2
>
> Could you clarify why you have used this calculation? The comment
> explains _what_ is done but not _why_ it is done.
>
> It is also strange (and unexplained) that the low limit will work
> differently depending on the memcg memory usage - bigger groups have a
> bigger chance to be reclaimed even if they are under the limit.

The idea is to decrease the scanning speed smoothly. It's hard to explain
why I used exactly these numbers; it's a bit like asking why DEF_PRIORITY
is 12 - just because it works :). Of course, these numbers are open to
discussion/change.

There is a picture in the attachment that illustrates how low limits work:
the red line is the memory usage of a cgroup with low_limit set to 1Gb,
the blue line is the memory usage of another cgroup, in which I ran
cat <large file> > /dev/null.

>> + *
>> + */
>> +unsigned int mem_cgroup_low_limit_scale(struct lruvec *lruvec)
>> +{
>> +        struct mem_cgroup_per_zone *mz;
>> +        struct mem_cgroup *memcg;
>> +        unsigned long long low_limit;
>> +        unsigned long long usage;
>> +        unsigned int i;
>> +
>> +        mz = container_of(lruvec, struct mem_cgroup_per_zone, lruvec);
>> +        memcg = mz->memcg;
>> +        if (!memcg)
>> +                return 0;
>> +
>> +        low_limit = res_counter_read_u64(&memcg->res, RES_LOW_LIMIT);
>> +        if (!low_limit)
>> +                return 0;
>> +
>> +        usage = res_counter_read_u64(&memcg->res, RES_USAGE);
>> +
>> +        if (usage < low_limit)
>> +                return DEF_PRIORITY - 2;
>> +
>> +        for (i = 0; i < DEF_PRIORITY - 2; i++)
>> +                if (usage - low_limit > (usage >> (i + 3)))
>> +                        break;
>
> Why doesn't this depend on the current reclaim priority?

How would you use the reclaim priority here? I don't like the idea of
starting to ignore the low limit at certain priorities.

In my implementation, low_limit_scale just "increases" the scanning
priority, but by no more than 10 (DEF_PRIORITY - 2). So if the priority
is 0-2, reclaim works as if the priority were 10-12, which means
"normal" slow reclaim.
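To make that concrete, the effect on the per-list scan target can be
sketched roughly like this (a simplified illustration only, not the actual
vmscan.c change; scan_target() and its parameters are made-up names, and
the hunks quoted here don't show whether or how the sum is capped):

#define DEF_PRIORITY    12

/*
 * Illustration only: how a low_limit_scale in [0, DEF_PRIORITY - 2]
 * shifts the effective scanning priority.  The real patch applies the
 * scale inside get_scan_count(); this helper just shows the arithmetic.
 */
static unsigned long scan_target(unsigned long lru_size, int priority,
                                 unsigned int low_limit_scale)
{
        /*
         * E.g. global reclaim at priority 2 with a scale of 10 scans
         * lru_size >> 12, i.e. as if the priority were DEF_PRIORITY,
         * which is the "normal" slow pace.
         */
        return lru_size >> (priority + low_limit_scale);
}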
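To put concrete numbers on the calculation quoted above, here is the same
mapping redone as a stand-alone user-space program (the arithmetic is
copied from mem_cgroup_low_limit_scale(); the res_counter reads are
replaced by plain arguments, and the sample values are mine):

#include <stdio.h>

#define DEF_PRIORITY    12

/* Same calculation as in the patch, minus the memcg/res_counter plumbing. */
static unsigned int low_limit_scale(unsigned long long usage,
                                    unsigned long long low_limit)
{
        unsigned int i;

        if (!low_limit)
                return 0;
        if (usage < low_limit)
                return DEF_PRIORITY - 2;        /* 10: slowest scanning */

        for (i = 0; i < DEF_PRIORITY - 2; i++)
                if (usage - low_limit > (usage >> (i + 3)))
                        break;
        return i;
}

int main(void)
{
        unsigned long long gb = 1ULL << 30;     /* low_limit = 1Gb, as in the graph */

        printf("%u\n", low_limit_scale(gb * 12 / 10, gb));   /* 1.2Gb used  -> 0  (full speed)  */
        printf("%u\n", low_limit_scale(gb * 105 / 100, gb)); /* 1.05Gb used -> 2  (slowed down) */
        printf("%u\n", low_limit_scale(gb / 2, gb));         /* 0.5Gb used  -> 10 (slowest)     */
        return 0;
}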
>
>> +
>> +        return i;
>> +}
>> +
>>  static unsigned long mem_cgroup_reclaim(struct mem_cgroup *memcg,
>>                                          gfp_t gfp_mask,
>>                                          unsigned long flags)
>
> [...]
>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 88c5fed..9c1c702 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1660,6 +1660,7 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
>>          bool force_scan = false;
>>          unsigned long ap, fp;
>>          enum lru_list lru;
>> +        unsigned int low_limit_scale = 0;
>>
>>          /*
>>           * If the zone or memcg is small, nr[l] can be 0.  This
>> @@ -1779,6 +1780,9 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
>>          fraction[1] = fp;
>>          denominator = ap + fp + 1;
>>  out:
>> +        if (global_reclaim(sc))
>> +                low_limit_scale = mem_cgroup_low_limit_scale(lruvec);
>
> What if the group is reclaimed as a result of its parent hitting its
> limit?

For now, low limits will only work for global reclaim. Enabling them for
target reclaim will require some additional checks; I plan to do that as
a separate change.

Thank you for your comments!

--
Regards,
Roman
[Attachment: low_limit_memcg.gif (GIF image)]