Re: [patch 2/8] mm: memcg-aware global reclaim

On Mon, Aug 29, 2011 at 12:04 PM, Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
On Mon, Aug 29, 2011 at 12:22:02AM -0700, Ying Han wrote:
> On Mon, Aug 29, 2011 at 12:15 AM, Ying Han <yinghan@xxxxxxxxxx> wrote:
> > On Thu, Aug 11, 2011 at 2:09 PM, Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> >>
> >> On Thu, Aug 11, 2011 at 01:39:45PM -0700, Ying Han wrote:
> >> > Please consider including the following patch for the next post. It causes
> >> > crash on some of the tests where sc->mem_cgroup is NULL (global kswapd).
> >> >
> >> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> >> > index b72a844..12ab25d 100644
> >> > --- a/mm/vmscan.c
> >> > +++ b/mm/vmscan.c
> >> > @@ -2768,7 +2768,8 @@ loop_again:
> >> >                          * Do some background aging of the anon list, to give
> >> >                          * pages a chance to be referenced before reclaiming.
> >> >                          */
> >> > -                       if (inactive_anon_is_low(zone, &sc))
> >> > +                       if (scanning_global_lru(&sc) &&
> >> > +                                       inactive_anon_is_low(zone, &sc))
> >> >                                 shrink_active_list(SWAP_CLUSTER_MAX, zone,
> >> >                                                         &sc, priority, 0);
> >>
> >> Thanks!  I completely overlooked this one and only noticed it after
> >> changing the arguments to shrink_active_list().
> >>
> >> On memcg configurations, scanning_global_lru() will essentially never
> >> be true again, so I moved the anon pre-aging to a separate function
> >> that also does a hierarchy loop to preage the per-memcg anon lists.
> >>
> >> I hope to send out the next revision soon.
> >
> > Also, please consider to fold in the following patch as well. It fixes
> > the root cgroup lru accounting and we could easily trigger OOM while
> > doing some swapoff test w/o it.
> >
> > mm: fix the lru accounting for root cgroup.
> >
> > This patch is applied on top of:
> > "
> > mm: memcg-aware global reclaim
> > Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> > "
> >
> > This patch fixes the lru accounting for root cgroup.
> >
> > After the "memcg-aware global reclaim" patch, one of the changes is that
> > lru pages are linked back to root. Under global memory pressure, we start from
> > the root cgroup lru and walk through the memcg hierarchy of the system. For
> > each memcg, we reclaim pages based on its lru size.
> >
> > However, for the root cgroup we used to not have a separate lru and only counted
> > the pages charged to root as part of the root lru size. Without this patch, all
> > the pages which are linked to the root lru but not charged to root, like swapcache
> > readahead, are not visible to the page reclaim code and we can easily get OOM.
> >
> > After this patch, all the pages linked under the root lru are counted in the lru
> > size, including Used and !Used.
> >
> > Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
> > Signed-off-by: Ying Han <yinghan@xxxxxxxxxx>
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 5518f54..f6c5f29 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -888,19 +888,21 @@ void mem_cgroup_del_lru_list(struct page *page, enum lru_list lru)
> >  {
> >  	struct page_cgroup *pc;
> >  	struct mem_cgroup_per_zone *mz;
> > +	struct mem_cgroup *mem;
> >
> >  	if (mem_cgroup_disabled())
> >  		return;
> >  	pc = lookup_page_cgroup(page);
> > -	/* can happen while we handle swapcache. */
> > -	if (!TestClearPageCgroupAcctLRU(pc))
> > -		return;
> > -	VM_BUG_ON(!pc->mem_cgroup);
> > -	/*
> > -	 * We don't check PCG_USED bit. It's cleared when the "page" is finally
> > -	 * removed from global LRU.
> > -	 */
> > -	mz = page_cgroup_zoneinfo(pc->mem_cgroup, page);
> > +
> > +	if (TestClearPageCgroupAcctLRU(pc) || PageCgroupUsed(pc)) {

This PageCgroupUsed part confuses me.  A page that is being isolated
shortly after being charged while on the LRU may reach here, and then
it is unaccounted from pc->mem_cgroup, which it never was accounted
to.

Could you explain why you added it?

To be honest, I don't have a very good reason for that. The PageCgroupUsed check was put there after running some tests, and some fixes seemed to help the tests pass, including this one.

The one case I can think of where a page is !AcctLRU | Used is in a pagevec. However, we shouldn't get to mem_cgroup_del_lru_list() for a page in a pagevec in the first place.

I now made it so that PageCgroupAcctLRU on the LRU means accounted to
pc->mem_cgroup,

this is the same as the current logic.
 
and !PageCgroupAcctLRU on the LRU means accounted to
and babysitted by root_mem_cgroup.  

this seems to be different from what it is now, especially for swapcache pages. So, is the page here linked to the root cgroup LRU or not?

Anyway, the AcctLRU flag still seems confusing to me:

what this flag tells me is whether or not the page is on a PRIVATE lru and being accounted; I use "private" here to differentiate from the per-zone lru, where the page also has the PageLRU flag. The two flags are separate since pages could be on one lru but not the other (I guess), but this changed after we got the root cgroup lru back. For example, AcctLRU was used to keep track of the accounted lru pages, especially for root (we didn't account the !Used pages, like readahead swapcache, to root). Now we account the full size of the root lru list, including Used and !Used, but only mark the Used pages with the AcctLRU flag.

So in general, I am wondering whether we could eventually replace that flag with the existing Used and LRU bits. Sorry, this seems to be something we would like to consider later, not necessarily now :)

Always.  Which also means that
before_commit now ensures an LRU page is moved to root_mem_cgroup for
babysitting during the charge, so that concurrent isolations/putbacks
are always accounted correctly.  Is this what you had in mind?  Did I
miss something?

In my tree, the before->commit->after protocol is folded into one function. I didn't post it since I know you also have a patch doing that. So I guess I don't understand why we need to move the page to root while it is going to be charged to a memcg by commit_charge shortly after.

My understanding is that in before_commit we uncharge the page from the previous memcg lru if AcctLRU was set; then in commit_charge we update the new owner of the page; and in after_commit we update the memcg lru for the new owner after linking the page into the lru.

--Ying
 