On Wed, Jul 29, 2015 at 04:20:59PM -0700, Mike Kravetz wrote:
> On 07/29/2015 12:08 PM, David Rientjes wrote:
> >On Tue, 28 Jul 2015, Jörn Engel wrote:
> >
> >>Well, we definitely need something. Having a 100GB process show 3GB of
> >>rss is not very useful. How would we notice a memory leak if it only
> >>affects hugepages, for example?
> >>
> >
> >Since the hugetlb pool is a global resource, it would also be helpful to
> >determine if a process is mapping more than expected. You can't do that
> >just by adding a huge rss metric, however: if you have 2MB and 1GB
> >hugepages configured you wouldn't know if a process was mapping 512 2MB
> >hugepages or 1 1GB hugepage.
> >
> >That's the purpose of hugetlb_cgroup, after all, and it supports usage
> >counters for all hstates. The test could be converted to use that to
> >measure usage if configured in the kernel.
> >
> >Beyond that, I'm not sure how a per-hstate rss metric would be exported to
> >userspace in a clean way and other ways of obtaining the same data are
> >possible with hugetlb_cgroup. I'm not sure how successful you'd be in
> >arguing that we need separate rss counters for it.
>
> If I want to track hugetlb usage on a per-task basis, do I then need to
> create one cgroup per task?
>
> For example, suppose I have many tasks using hugetlb and the global pool
> is getting low on free pages. It might be useful to know which tasks are
> using hugetlb pages, and how many they are using.
>
> I don't actually have this need (I think), but it appears to be what
> Jörn is asking for.

One possible way to get a hugetlb metric on a per-task basis is to walk
the page table via /proc/pid/pagemap and count the page flags of each
mapped page (we can easily do this with tools/vm/page-types.c, e.g.
"page-types -p <PID> -b huge"). This is obviously slower than keeping the
counter as in-kernel data and just exporting it, but it might be useful in
some situations.
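
For illustration, here is a rough Python sketch of that walk. It is not
what page-types.c does internally, only the same idea under some
assumptions: 4 KiB base pages, the bit layout described in the kernel's
pagemap documentation (present bit 63, PFN in bits 0-54, KPF_HUGE as bit 17
of /proc/kpageflags), and root privileges, since PFNs and kpageflags are
not readable otherwise. The helper names are made up for this example.

```python
import struct

PAGE_SIZE = 4096               # assumption: 4 KiB base pages
PM_PRESENT = 1 << 63           # pagemap bit 63: page is present in RAM
PM_PFN_MASK = (1 << 55) - 1    # pagemap bits 0-54: page frame number
KPF_HUGE = 1 << 17             # kpageflags bit 17: page belongs to hugetlb


def decode_pagemap_entry(entry):
    """Return (present, pfn) for one 64-bit pagemap entry."""
    if not entry & PM_PRESENT:
        return (False, 0)
    return (True, entry & PM_PFN_MASK)


def is_huge(flags):
    """True if a kpageflags word has the KPF_HUGE bit set."""
    return bool(flags & KPF_HUGE)


def vma_ranges(pid):
    """Yield (start, end) virtual address ranges from /proc/<pid>/maps."""
    with open(f"/proc/{pid}/maps") as maps:
        for line in maps:
            start, end = line.split()[0].split("-")
            yield int(start, 16), int(end, 16)


def count_huge_base_pages(pid):
    """Count present base-page slots of <pid> that are backed by hugetlb.

    Multiply the result by PAGE_SIZE to get bytes. Needs root.
    """
    count = 0
    with open(f"/proc/{pid}/pagemap", "rb") as pagemap, \
         open("/proc/kpageflags", "rb") as kpageflags:
        for start, end in vma_ranges(pid):
            for vaddr in range(start, end, PAGE_SIZE):
                # One 8-byte pagemap entry per virtual page.
                pagemap.seek((vaddr // PAGE_SIZE) * 8)
                raw = pagemap.read(8)
                if len(raw) < 8:
                    break  # range not covered by pagemap (e.g. vsyscall)
                present, pfn = decode_pagemap_entry(struct.unpack("=Q", raw)[0])
                if not present:
                    continue
                # One 8-byte flags word per physical page frame.
                kpageflags.seek(pfn * 8)
                raw = kpageflags.read(8)
                if len(raw) == 8 and is_huge(struct.unpack("=Q", raw)[0]):
                    count += 1
    return count
```

Calling count_huge_base_pages(<PID>) as root should give the number of 4 KiB
slots mapped from hugetlb pages. Like the pool-wide counters, it cannot tell
2MB from 1GB pages without also looking at the mapping's page size, and it
does one seek per page, so it is slow on large address spaces.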

Thanks,
Naoya Horiguchi