On Mon, 17 Jan 2011 20:14:00 +0100 Johannes Weiner <hannes@xxxxxxxxxxx> wrote: > Hello, > > on the MM summit, I would like to talk about the current state of > memory control groups, the features and extensions that are currently > being developed for it, and what their status is. > > I am especially interested in talking about the current runtime memory > overhead memcg comes with (1% of ram) and what we can do to shrink it. > > In comparison to how efficiently struct page is packed, and given that > distro kernels come with memcg enabled per default, I think we should > put a bit more thought into how struct page_cgroup (which exists for > every page in the system as well) is organized. > > I have a patch series that removes the page backpointer from struct > page_cgroup by storing a node ID (or section ID, depending on whether > sparsemem is configured) in the free bits of pc->flags. > > I also plan on replacing the pc->mem_cgroup pointer with an ID > (KAMEZAWA-san has patches for that), and move it to pc->flags too. > Every flag not used means doubling the amount of possible control > groups, so I have patches that get rid of some flags currently > allocated, including PCG_CACHE, PCG_ACCT_LRU, and PCG_MIGRATION. > > [ I meant to send those out much earlier already, but a bug in the > migration rework was not responding to my yelling 'Marco', and now my > changes collide horribly with THP, so it will take another rebase. ] > > The per-memcg dirty accounting work e.g. allocates a bunch of new bits > in pc->flags and I'd like to hash out if this leaves enough room for > the structure packing I described, or whether we can come up with a > different way of tracking state. > I see that there are requests for shrinking page_cgroup. And yes, I think we should do so. I think there are trade-off between performance v.s. memory usage. So, could you show the numbers when we discuss it ? BTW, I think we can... - PCG_ACCT_LRU bit can be dropped.(I think list_empty(&pc->lru) can be used. ROOT cgroup will not be problem.) - pc->mem_cgroup can be replaced with ID. But move it into flags field seems difficult because of races. - pc->page can be replaced with some lookup routine. But Section bit encoding may be something mysterious and look up cost will be problem. - PCG_CACHE bit is a duplicate of information of 'page'. So, we can use PageAnon() - I'm not sure PCG_MIGRATION. It's for avoiding races. Note: we'll need to use 16bits for blkio tracking. Another idea is dynamic allocation of page_cgroup. It may be able to be a help for THP enviroment but will not work well (just adds overhead) against file cache workload. Anwyay, my priority of development for memcg this year is: 1. dirty ratio support. 2. Backgound reclaim (kswapd) 3. blkio tracking. Diet of page_cgroup should be done in step by step. We've seen many level down when some new feature comes to memory cgroup. Thanks, -Kame -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxxx For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/ Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>