On Tue, Jun 19, 2012 at 5:05 AM, Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> On Mon, Jun 18, 2012 at 09:47:31AM -0700, Ying Han wrote:
>> The function zone_reclaimable() marks zone->all_unreclaimable based on
>> per-zone pages_scanned and reclaimable_pages. If all_unreclaimable is
>> true, alloc_pages could go to OOM instead of getting stuck in page
>> reclaim.
>
> There is no zone->all_unreclaimable at this point, you removed it in
> the previous patch.

Ah, I forgot to update the commit log after applying the recent patch
from Kosaki.

>> In memcg kernel, cgroup under its softlimit is not targeted under
>> global reclaim. So we need to remove those pages from
>> reclaimable_pages, otherwise it will cause reclaim mechanism to get
>> stuck trying to reclaim from all_unreclaimable zone.
>
> Can't you check if zone->pages_scanned changed in between reclaim
> runs?
>
> Or sum up the scanned and reclaimable pages encountered while
> iterating the hierarchy during regular reclaim and then use those
> numbers in the equation instead of the per-zone counters?
>
> Walking the full global hierarchy in all the places where we check if
> a zone is reclaimable is a scalability nightmare.

I agree on that; I will explore that a bit more.
>> @@ -100,18 +100,36 @@ static __always_inline enum lru_list page_lru(struct page *page)
>>  	return lru;
>>  }
>>
>> +static inline unsigned long get_lru_size(struct lruvec *lruvec,
>> +					 enum lru_list lru)
>> +{
>> +	if (!mem_cgroup_disabled())
>> +		return mem_cgroup_get_lru_size(lruvec, lru);
>> +
>> +	return zone_page_state(lruvec_zone(lruvec), NR_LRU_BASE + lru);
>> +}
>> +
>>  static inline unsigned long zone_reclaimable_pages(struct zone *zone)
>>  {
>> -	int nr;
>> +	int nr = 0;
>> +	struct mem_cgroup *memcg;
>> +
>> +	memcg = mem_cgroup_iter(NULL, NULL, NULL);
>> +	do {
>> +		struct lruvec *lruvec = mem_cgroup_zone_lruvec(zone, memcg);
>>
>> -	nr = zone_page_state(zone, NR_ACTIVE_FILE) +
>> -		zone_page_state(zone, NR_INACTIVE_FILE);
>> +		if (should_reclaim_mem_cgroup(memcg)) {
>> +			nr += get_lru_size(lruvec, LRU_INACTIVE_FILE) +
>> +			      get_lru_size(lruvec, LRU_ACTIVE_FILE);
>
> Sometimes, the number of reclaimable pages DO include those of groups
> for which should_reclaim_mem_cgroup() is false: when the priority
> level is <= DEF_PRIORITY - 2, as you defined in 1/5!  This means that
> you consider pages you just scanned unreclaimable, which can result in
> the zone being unreclaimable after the DEF_PRIORITY - 2 cycle, no?

That is true, and I thought about it as well. I would add a priority
check here and only start considering those pages when the priority is
< DEF_PRIORITY - 2.

--Ying

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to
majordomo@xxxxxxxxx.  For more info on Linux MM, see: http://www.linux-mm.org/ .