On Tue, 30 Nov 2010 12:40:16 -0800
Ying Han <yinghan@xxxxxxxxxx> wrote:

> On Tue, Nov 30, 2010 at 12:54 AM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> > On Tue, 30 Nov 2010 17:27:10 +0900
> > KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> >
> >> On Tue, 30 Nov 2010 17:15:37 +0900
> >> Minchan Kim <minchan.kim@xxxxxxxxx> wrote:
> >>
> >> > Ideally, I hope we unify global and memcg of kswapd for easy
> >> > maintenance if it's not a big problem.
> >> > When we make patches about lru pages, we always have to consider what
> >> > I should do for memcg.
> >> > And when we review patches, we also should consider what the patch is
> >> > missing for memcg.
> >> > It makes maintenance cost big. Of course, if memcg maintainers are
> >> > involved with all patches, it's no problem as it is.
> >> >
> >> I know it's not. But thread control of kswapd will not have much merging point.
> >> And balance_pgdat() is fully replaced in patch/3. The effort for merging seems
> >> not big.
>
> I intended to separate out the per-memcg kswapd logic and not have it
> interfere with existing code. This should help for merging.
>
yes.

> >>
> >
> > kswapd's balance_pgdat() is for the following:
> >  - reclaim pages within a node.
> >  - balancing zones in a pgdat.
> >
> > memcg's background reclaim needs the following:
> >  - reclaim pages within a memcg
> >  - reclaim pages from arbitrary zones, if it's fair, it's good.
> >    But it's not important which zone the pages are reclaimed from.
> >    (I'm not sure we can select "the oldest" pages from divided LRU.)
>
> The current implementation is simple: it iterates over all the nodes
> and reclaims pages from the per-memcg-per-zone LRU. As long as the
> wmarks are ok, the kswapd is done.
> Meanwhile, in order to not waste
> cputime on "unreclaimable" nodes (a node is unreclaimable if all the
> zones are unreclaimable), I used the nodemask to record that from the
> last scan, and the bit is reset as soon as a page is returned back.
> This is similar to the logic used in the global kswapd.
>
> A potential improvement is to remember the last node we reclaimed
> from, and start from the next node on the next kswapd wake_up.
> This avoids the case where all the memcg kswapds are reclaiming from the
> small node ids on large numa machines.
>
Yes, that's helpful.

> >
> > Then, merging will put 2 _very_ different functionalities into 1 function.
>
> Agree.
>
> >
> > So, I thought it's simpler to implement
> >
> >  1. a victim node selector (This algorithm will never be in kswapd.)
>
> Yeah, or round robin as I replied above ?
>
I think it's good to have.

> >  2. call _existing_ try_to_free_pages_mem_cgroup() with node local zonelist.
> >  Sharing is enough.
>
> That will in turn use direct reclaim logic which has no notion of wmarks.
>
	do {
		node = select_victim_node();
		do_try_to_free_pages_mem_cgroup(node);
		check watermark
	}
or
	If we need to check priority at all, your new balance_pgdat_mem_cgroup()
	will be good.

> > kswapd stop/go routine may be able to be shared. But this patch itself seems not
> > very good to me.
>
> This looks like a feasible change, I will double check with it.

Thanks.

Regards,
-Kame
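
For illustration only, here is a minimal userspace sketch of the loop discussed above: a round-robin victim node selector that skips nodes recorded as unreclaimable, driven until the watermark is ok. It is not kernel code and not taken from the patch. select_victim_node() is the name used in the pseudocode above; struct memcg_kswapd_state, MAX_NODES, the mark/clear helpers, memcg_background_reclaim() and the two callbacks are hypothetical names, and a plain 64-bit mask stands in for the kernel nodemask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 64 /* hypothetical limit so the node set fits in one 64-bit mask */

/* Hypothetical per-memcg background reclaim state; not a kernel structure. */
struct memcg_kswapd_state {
    int nr_nodes;           /* number of NUMA nodes, 1..MAX_NODES */
    int last_node;          /* node picked last time, -1 if none yet */
    uint64_t unreclaimable; /* bit n set: nothing was reclaimable on node n at the last scan */
};

/* Record that a scan of this node reclaimed nothing. */
void mark_node_unreclaimable(struct memcg_kswapd_state *st, int node)
{
    st->unreclaimable |= UINT64_C(1) << node;
}

/* Reset the bit, e.g. once a page is returned back to the node. */
void clear_node_unreclaimable(struct memcg_kswapd_state *st, int node)
{
    st->unreclaimable &= ~(UINT64_C(1) << node);
}

/*
 * Round robin: start at the node after last_node and return the first node
 * that is not marked unreclaimable, or -1 if every node is marked.
 */
int select_victim_node(struct memcg_kswapd_state *st)
{
    assert(st->nr_nodes > 0 && st->nr_nodes <= MAX_NODES);

    for (int i = 1; i <= st->nr_nodes; i++) {
        int node = (st->last_node + i) % st->nr_nodes;

        if (!(st->unreclaimable & (UINT64_C(1) << node))) {
            st->last_node = node;
            return node;
        }
    }
    return -1;
}

/* Hypothetical callbacks standing in for the real reclaim and watermark checks. */
typedef unsigned long (*reclaim_fn)(int node, void *ctx); /* returns pages reclaimed */
typedef bool (*wmark_ok_fn)(void *ctx);

/*
 * Reclaim node by node until the watermark is ok or no candidate node is
 * left. Unlike the do { } form above, the watermark is checked first.
 */
unsigned long memcg_background_reclaim(struct memcg_kswapd_state *st,
                                       reclaim_fn reclaim,
                                       wmark_ok_fn wmark_ok, void *ctx)
{
    unsigned long total = 0;

    while (!wmark_ok(ctx)) {
        int node = select_victim_node(st);
        unsigned long nr;

        if (node < 0)
            break; /* every node is marked unreclaimable */

        nr = reclaim(node, ctx);
        if (nr == 0)
            mark_node_unreclaimable(st, node);
        total += nr;
    }
    return total;
}
```

Because last_node is kept across calls, each wake-up continues from the node after the previous victim, so several memcg kswapds would not all start from node 0.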