On Thu, Jun 9, 2011 at 8:00 AM, Michal Hocko <mhocko@xxxxxxx> wrote:
> On Thu 02-06-11 22:25:29, Ying Han wrote:
>> On Thu, Jun 2, 2011 at 2:55 PM, Ying Han <yinghan@xxxxxxxxxx> wrote:
>> > On Tue, May 31, 2011 at 11:25 PM, Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>> >> Currently, soft limit reclaim is entered from kswapd, where it selects
> [...]
>> >> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> >> index c7d4b44..0163840 100644
>> >> --- a/mm/vmscan.c
>> >> +++ b/mm/vmscan.c
>> >> @@ -1988,9 +1988,13 @@ static void shrink_zone(int priority, struct zone *zone,
>> >>                        unsigned long reclaimed = sc->nr_reclaimed;
>> >>                        unsigned long scanned = sc->nr_scanned;
>> >>                        unsigned long nr_reclaimed;
>> >> +                      int epriority = priority;
>> >> +
>> >> +                      if (mem_cgroup_soft_limit_exceeded(root, mem))
>> >> +                              epriority -= 1;
>> >
>> > Here we grant the ability to shrink from all the memcgs, but only
>> > higher the priority for those exceed the soft_limit. That is a design
>> > change
>> > for the "soft_limit" which giving a hint to which memcgs to reclaim
>> > from first under global memory pressure.
>>
>>
>> Basically, we shouldn't reclaim from a memcg under its soft_limit
>> unless we have trouble reclaim pages from others.
>
> Agreed.
>
>> Something like the following makes better sense:
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index bdc2fd3..b82ba8c 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1989,6 +1989,8 @@ restart:
>>        throttle_vm_writeout(sc->gfp_mask);
>>  }
>>
>> +#define MEMCG_SOFTLIMIT_RECLAIM_PRIORITY       2
>> +
>>  static void shrink_zone(int priority, struct zone *zone,
>>                                struct scan_control *sc)
>>  {
>> @@ -2001,13 +2003,13 @@ static void shrink_zone(int priority, struct zone *zone,
>>                        unsigned long reclaimed = sc->nr_reclaimed;
>>                        unsigned long scanned = sc->nr_scanned;
>>                        unsigned long nr_reclaimed;
>> -                      int epriority = priority;
>>
>> -                      if (mem_cgroup_soft_limit_exceeded(root, mem))
>> -                              epriority -= 1;
>> +                      if (!mem_cgroup_soft_limit_exceeded(root, mem) &&
>> +                                      priority > MEMCG_SOFTLIMIT_RECLAIM_PRIORITY)
>> +                              continue;
>
> yes, this makes sense but I am not sure about the right(tm) value of the
> MEMCG_SOFTLIMIT_RECLAIM_PRIORITY. 2 sounds too low. You would do quite a
> lot of loops
> (DEFAULT_PRIORITY-MEMCG_SOFTLIMIT_RECLAIM_PRIORITY) * zones * memcg_count
> without any progress (assuming that all of them are under soft limit
> which doesn't sound like a totally artificial configuration) until you
> allow reclaiming from groups that are under soft limit. Then, when you
> finally get to reclaiming, you scan rather aggressively.

Fair enough, something smarter is definitely needed :)

>
> Maybe something like 3/4 of DEFAULT_PRIORITY? You would get 3 times
> over all (unbalanced) zones and all cgroups that are above the limit
> (scanning max{1/4096+1/2048+1/1024, 3*SWAP_CLUSTER_MAX} of the LRUs for
> each cgroup) which could be enough to collect the low hanging fruit.

Hmm, that sounds more reasonable than the initial proposal.

For the same worst case, where all the memcgs are below their soft limit,
we still need to walk all the memcgs 3 times before actually reclaiming
anything.
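To put rough numbers on the two thresholds discussed above, here is a
small user-space sketch of the arithmetic. It is not kernel code. It
assumes DEF_PRIORITY is 12 (the constant the thread refers to as
DEFAULT_PRIORITY) and SWAP_CLUSTER_MAX is 32, and that each pass at
priority p scans roughly lru_pages >> p. The helper names are
hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Assumed values; the thread calls the first one DEFAULT_PRIORITY. */
#define DEF_PRIORITY      12
#define SWAP_CLUSTER_MAX  32UL

/*
 * Hypothetical helper: number of priority levels at which memcgs under
 * their soft limit are skipped, i.e. levels with priority > threshold.
 */
static int skipped_levels(int threshold)
{
	assert(threshold >= 0 && threshold <= DEF_PRIORITY);
	return DEF_PRIORITY - threshold;
}

/*
 * Hypothetical helper: worst-case loop iterations that make no progress
 * when every memcg is under its soft limit:
 * (DEF_PRIORITY - threshold) * zones * memcg_count.
 */
static unsigned long wasted_iterations(int threshold, unsigned long zones,
				       unsigned long memcgs)
{
	return (unsigned long)skipped_levels(threshold) * zones * memcgs;
}

/*
 * Hypothetical helper: pages scanned from one above-limit memcg's LRU
 * during the skipped levels, following the estimate in the thread:
 * max{sum of lru_pages >> priority, levels * SWAP_CLUSTER_MAX}.
 */
static unsigned long pages_scanned_while_skipping(int threshold,
						  unsigned long lru_pages)
{
	unsigned long by_shift = 0;
	unsigned long batch_floor =
		(unsigned long)skipped_levels(threshold) * SWAP_CLUSTER_MAX;
	int prio;

	for (prio = DEF_PRIORITY; prio > threshold; prio--)
		by_shift += lru_pages >> prio;

	return by_shift > batch_floor ? by_shift : batch_floor;
}
```

With 4 zones and 100 memcgs, a threshold of 2 gives
wasted_iterations(2, 4, 100) = 4000 fruitless iterations, while a
threshold of 9 (3/4 of 12) gives 1200. For an LRU of 1048576 pages,
pages_scanned_while_skipping(9, 1048576) is 256 + 512 + 1024 = 1792,
which corresponds to the 1/4096 + 1/2048 + 1/1024 estimate.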
For that condition, I cannot think of anything that solves the problem
completely unless we keep a separate per-zone list of memcgs (like what
we do currently).

--Ying

> --
> Michal Hocko
> SUSE Labs
> SUSE LINUX s.r.o.
> Lihovarska 1060/12
> 190 00 Praha 9
> Czech Republic
>