On Tue, Nov 30, 2010 at 12:54 AM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> On Tue, 30 Nov 2010 17:27:10 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
>
>> On Tue, 30 Nov 2010 17:15:37 +0900
>> Minchan Kim <minchan.kim@xxxxxxxxx> wrote:
>>
>> > Ideally, I hope we unify global and memcg of kswapd for easy
>> > maintainance if it's not a big problem.
>> > When we make patches about lru pages, we always have to consider what
>> > I should do for memcg.
>> > And when we review patches, we also should consider what the patch is
>> > missing for memcg.
>> > It makes maintainance cost big. Of course, if memcg maintainers is
>> > involved with all patches, it's no problem as it is.
>> >
>> I know it's not. But thread control of kswapd will not have much merging point.
>> And balance_pgdat() is fully replaced in patch/3. The effort for merging seems
>> not big.

I intended to separate the per-memcg kswapd logic out so that it does
not interfere with the existing code. This should help with merging.

>>
>
> kswapd's balance_pgdat() is for following
>  - reclaim pages within a node.
>  - balancing zones in a pgdat.
>
> memcg's background reclaim needs followings.
>  - reclaim pages within a memcg
>  - reclaim pages from arbitrary zones, if it's fair, it's good.
>    But it's not important from which zone the pages are reclaimed from.
>    (I'm not sure we can select "the oldest" pages from divided LRU.)

The current implementation is simple: it iterates over all the nodes
and reclaims pages from the per-memcg-per-zone LRUs. As soon as the
wmarks are ok, the kswapd is done. Meanwhile, in order not to waste
cputime on "unreclaimable" nodes (a node is unreclaimable if all of its
zones are unreclaimable), I use a nodemask to record them from the last
scan, and a node's bit is reset as soon as a page is returned. This is
similar to the logic used in the global kswapd.

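To make the scan described above concrete, here is a minimal userspace
sketch in C. It is not the patch code. struct memcg_sim, wmark_ok(),
reclaim_node(), memcg_kswapd_pass(), page_returned(), MAX_NODES and the
batch parameter are all names made up for illustration. The sketch also
assumes one page counter per node in place of the per-memcg-per-zone
LRUs, and a plain bitmask in place of a nodemask.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8 /* hypothetical; stands in for the kernel's node count */

/* Hypothetical model of one memcg; not the structures used in the patch. */
struct memcg_sim {
	unsigned long usage;                /* pages currently charged */
	unsigned long high_wmark;           /* reclaim stops once usage <= this */
	unsigned long lru_pages[MAX_NODES]; /* reclaimable pages per node */
	unsigned int unreclaimable;         /* bit n set: node n gave nothing last scan */
};

static bool wmark_ok(const struct memcg_sim *m)
{
	return m->usage <= m->high_wmark;
}

/* Reclaim up to `batch` pages from one node, never going below the wmark. */
static unsigned long reclaim_node(struct memcg_sim *m, int nid,
				  unsigned long batch)
{
	unsigned long need = m->usage - m->high_wmark; /* > 0: caller checked */
	unsigned long got = m->lru_pages[nid];

	if (got > batch)
		got = batch;
	if (got > need)
		got = need;
	m->lru_pages[nid] -= got;
	m->usage -= got;
	return got;
}

/* One background-reclaim pass over all nodes; returns pages reclaimed. */
static unsigned long memcg_kswapd_pass(struct memcg_sim *m, unsigned long batch)
{
	unsigned long total = 0;

	assert(batch > 0);
	for (int nid = 0; nid < MAX_NODES && !wmark_ok(m); nid++) {
		unsigned long got;

		if (m->unreclaimable & (1u << nid))
			continue; /* skip nodes that yielded nothing last time */
		got = reclaim_node(m, nid, batch);
		if (got == 0)
			m->unreclaimable |= 1u << nid; /* remember for next scan */
		total += got;
	}
	return total;
}

/* Hypothetical hook: a page came back on node `nid`, so retry it next scan. */
static void page_returned(struct memcg_sim *m, int nid)
{
	m->unreclaimable &= ~(1u << nid);
}
```

The pass stops as soon as wmark_ok() holds, so nodes later in the walk
are not touched at all. A node that yields nothing is skipped on later
passes until page_returned() clears its bit.
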
A potential improvement is to remember the last node we reclaimed from
and to start from the next node on the next kswapd wake_up. This avoids
the case where all the memcg kswapds reclaim from the small node ids on
large NUMA machines.

>
> Then, merging will put 2 _very_ different functionalities into 1 function