Re: [RFC][PATCH] memcg: remove PCG_ACCT_LRU.

KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> · Tue, 6 Dec 2011 19:21:01 +0900

On Mon, 5 Dec 2011 23:36:34 -0800 (PST)
Hugh Dickins <hughd@xxxxxxxxxx> wrote:

> On Tue, 6 Dec 2011, KAMEZAWA Hiroyuki wrote:
> > On Mon, 5 Dec 2011 16:13:06 -0800 (PST)
> > Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> > > 
> > > Ying and I found PageCgroupAcctLRU very hard to grasp, even despite
> > > the comments Hannes added to explain it.  
> > 
> > Now, I don't think it's difficult. It seems no file system codes
> > add pages to LRU before add_to_page_cache() (I checked.)
> > So, what we need to care is only swap-cache. In swap-cache path,
> > we can do slow work.
> 
> I've been reluctant to add more special code for SwapCache:
> it may or may not be a good idea.  Hannes also noted a FUSE
> case which requires the before-commit-after handling swap was
> using (for memcg-zone lru locking we've merged them into commit).
> 

I think we need a fix for FUSE. In past, FUSE/splice used
add_to_page_cache() but not it uses replace_page_cache().
So, we need another care. (I posted a patch.)

> > 
> > > In moving the LRU locking
> > > from zone to memcg, we needed to depend upon pc->mem_cgroup: that
> > > was difficult while the interpretation of pc->mem_cgroup depended
> > > upon two flags also; and very tricky when pages were liable to shift
> > > underneath you from one LRU to another, as flags came and went.
> > > So we already eliminated PageCgroupAcctLRU here.
> > > 
> > 
> > Okay, Hm, do you see performance improvement by moving locks ?
> 
> I was expecting someone to ask that question!  I'm not up-to-date
> on it, it's one of the things I have to get help to gather before
> sending in the patch series.
> 
> I believe the answer is that we saw some improvement on some tests,
> but not so much as to make a hugely compelling case for the change.
> But by that time we'd invested a lot of testing in the memcg locking,
> and little in the original zone locking, so went with the memcg
> locking anyway.
> 
> We'll get more results and hope to show a stronger case for it now.
> But our results will probably have to be based on in-house kernels,
> with a lot of the "infrastructure" mods already in place, to allow
> an easy build-time switch between zone locking and memcg locking.
> 
> That won't be such a fair test if the "infrastructure" mods are
> themselves detrimental (I believe not).  It would be better to
> compare, say, 3.2.0-next against 3.2.0-next plus our patches -
> but my own (quad) machines for testing upstream kernels won't
> be big enough to show much of interest.  I'm rather hoping
> someone will be interested enough to try on something beefier.
> 

Hmm, at first glance at the patch, it seems far complicated than
I expected and added much checks and hooks to lru path...

> > > 
> > > However, I've hardly begun splitting the changes up into a series:
> > > had intended to do so last week, but day followed day...  If you'd
> > > like to see the unpolished uncommented rollup, I can post that.
> > > 
> > 
> > please.
> > Anyway, I'll post my own again as output even if I stop my work there.
> 
> Okay, here it is: my usual mix of cleanup and functional changes.
> There's work by Ying and others in here - will apportion authorship
> more fairly when splitting.  If you're looking through it at all,
> the place to start would be memcontrol.c's lock_page_lru_irqsave().
> 

Thank you. This seems inetersting patch. Hmm...what I think of now is..
In most case, pages are newly allocated and charged ,and then, added to LRU.
pc->mem_cgroup never changes while pages are on LRU.

I have a fix for corner cases as to do

	1. lock lru
	2. remove-page-from-lru
	3. overwrite pc->mem_cgroup
	4. add page to lru again
	5. unlock lru

And blindly believe pc->mem_cgroup regardless of PCG_USED bit at LRU handling.

Hm, per-zone-per-memcg lru locking is much easier if
 - we igonore PCG_USED bit at lru handling
 - we never overwrite pc->mem_cgroup if the page is on LRU.
 - if page may be added to LRU by pagevec etc.. while we overwrite
   pc->mem_cgroup, we always take lru_lock. This is our corner case.

isn't it ? I posted a series of patch. I'm glad if you give me a
quick review.

Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>