On Thu, Apr 12, 2012 at 6:00 AM, Ying Han <yinghan@xxxxxxxxxx> wrote:
> The function zone_reclaimable() marks zone->all_unreclaimable based on
> per-zone pages_scanned and reclaimable_pages. If all_unreclaimable is
> true, alloc_pages can go to OOM instead of getting stuck in page
> reclaim.
>
> In the memcg kernel, a cgroup under its soft limit is not targeted by
> global reclaim. It is possible for all memcgs to be under their soft
> limit for a particular zone. In that case the direct reclaim path
> do_try_to_free_pages() always returns 1, which causes the caller
> __alloc_pages_direct_reclaim() to enter a tight loop.
>
> The reclaim priority check we put in should_reclaim_mem_cgroup()
> should help this case, but we still don't want to burn CPU cycles for
> the first few priorities to get to that point. The idea, from the LSF
> discussion, is to detect the situation after the first round of
> scanning and restart the reclaim without looking at the soft limit at
> all. This allows us to make forward progress in shrink_zone() and free
> some pages on the zone.
>
> In order to detect that all the memcgs were scanned under
> shrink_zone(), I have to change mem_cgroup_iter() from a shared walk
> to a full walk. Otherwise it would be very easy to skip lots of memcgs
> above their soft limit, causing the "ignore_softlimit" flag to be set
> mistakenly.

Perhaps that detection could be covered by

	return target_mem_cgroup ||
	       mem_cgroup_soft_limit_exceeded(memcg) ||
	       (priority <= DEF_PRIORITY - 3);

and only then consider replacing the shared walk with a full walk.
> Signed-off-by: Ying Han <yinghan@xxxxxxxxxx>
> ---
>  mm/vmscan.c |   23 ++++++++++++++++-------
>  1 files changed, 16 insertions(+), 7 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 2dbc300..d65eae4 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2158,21 +2158,25 @@ static void shrink_zone(int priority, struct zone *zone,
>  			struct scan_control *sc)
>  {
>  	struct mem_cgroup *root = sc->target_mem_cgroup;
> -	struct mem_cgroup_reclaim_cookie reclaim = {
> -		.zone = zone,
> -		.priority = priority,
> -	};
>  	struct mem_cgroup *memcg;
> +	int above_softlimit, ignore_softlimit = 0;
>
> -	memcg = mem_cgroup_iter(root, NULL, &reclaim);
> +restart:
> +	above_softlimit = 0;
> +	memcg = mem_cgroup_iter(root, NULL, NULL);
>  	do {
>  		struct mem_cgroup_zone mz = {
>  			.mem_cgroup = memcg,
>  			.zone = zone,
>  		};
>
> -		if (should_reclaim_mem_cgroup(root, memcg, priority))
> +		if (ignore_softlimit ||
> +		    should_reclaim_mem_cgroup(root, memcg, priority)) {
>  			shrink_mem_cgroup_zone(priority, &mz, sc);
> +			above_softlimit = 1;
> +		}
>
>  		/*
>  		 * Limit reclaim has historically picked one memcg and
> @@ -2188,8 +2192,13 @@ static void shrink_zone(int priority, struct zone *zone,
>  			mem_cgroup_iter_break(root, memcg);
>  			break;
>  		}
> -		memcg = mem_cgroup_iter(root, memcg, &reclaim);
> +		memcg = mem_cgroup_iter(root, memcg, NULL);
>  	} while (memcg);
> +
> +	if (!above_softlimit) {
> +		ignore_softlimit = 1;
> +		goto restart;
> +	}
>  }
>
>  /* Returns true if compaction should go ahead for a high-order request */
> --
> 1.7.7.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to
majordomo@xxxxxxxxx. For more info on Linux MM, see:
http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/