On Tue, Aug 04, 2015 at 02:55:30AM +0000, Naoya Horiguchi wrote:
> On Wed, Jul 29, 2015 at 04:20:59PM -0700, Mike Kravetz wrote:
> > On 07/29/2015 12:08 PM, David Rientjes wrote:
> > >On Tue, 28 Jul 2015, Jörn Engel wrote:
> > >
> > >>Well, we definitely need something.  Having a 100GB process show 3GB of
> > >>rss is not very useful.  How would we notice a memory leak if it only
> > >>affects hugepages, for example?
> > >>
> > >
> > >Since the hugetlb pool is a global resource, it would also be helpful to
> > >determine if a process is mapping more than expected.  You can't do that
> > >just by adding a huge rss metric, however: if you have 2MB and 1GB
> > >hugepages configured you wouldn't know if a process was mapping 512 2MB
> > >hugepages or 1 1GB hugepage.
> > >
> > >That's the purpose of hugetlb_cgroup, after all, and it supports usage
> > >counters for all hstates.  The test could be converted to use that to
> > >measure usage if configured in the kernel.
> > >
> > >Beyond that, I'm not sure how a per-hstate rss metric would be exported to
> > >userspace in a clean way and other ways of obtaining the same data are
> > >possible with hugetlb_cgroup.  I'm not sure how successful you'd be in
> > >arguing that we need separate rss counters for it.
> >
> > If I want to track hugetlb usage on a per-task basis, do I then need to
> > create one cgroup per task?
> >
> > For example, suppose I have many tasks using hugetlb and the global pool
> > is getting low on free pages.  It might be useful to know which tasks are
> > using hugetlb pages, and how many they are using.
> >
> > I don't actually have this need (I think), but it appears to be what
> > Jörn is asking for.
>
> One possible way to get hugetlb metric in per-task basis is to walk page
> table via /proc/pid/pagemap, and counting page flags for each mapped page
> (we can easily do this with tools/vm/page-types.c like "page-types -p <PID>
> -b huge"). This is obviously slower than just storing the counter as
> in-kernel data and just exporting it, but might be useful in some situation.

BTW, currently smaps doesn't report any meaningful info for vma(VM_HUGETLB).
I wrote the following patch, which hopefully is helpful for your purpose.

Thanks,
Naoya Horiguchi
---
From: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
Subject: [PATCH] smaps: fill missing fields for vma(VM_HUGETLB)

Currently smaps reports many zero fields for vma(VM_HUGETLB), which is
inconvenient when we want to know hugetlb usage on a per-task or per-vma
basis. This patch fills in these fields by introducing smaps_hugetlb_range().
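
The before/after output below is consistent with a private anonymous hugetlb
mapping of 10 x 2MB pages with 9 of them faulted in (Size 20480 kB, Rss
18432 kB). A minimal sketch of such a test follows; this is an assumed
reproducer for illustration, not necessarily the program actually used, and
it needs enough pages reserved in the hugetlb pool (e.g. vm.nr_hugepages):

/*
 * Assumed reproducer sketch: map 10 * 2MB of private anonymous hugetlb
 * memory, fault in 9 of the pages, then sleep so /proc/<pid>/smaps can
 * be inspected.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

#define HPAGE_SIZE      (2UL << 20)     /* assumes 2MB default hugepage size */
#define NR_HPAGES       10

int main(void)
{
        char *p = mmap(NULL, NR_HPAGES * HPAGE_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
        int i;

        if (p == MAP_FAILED) {
                perror("mmap");
                return 1;
        }
        /* touch 9 of the 10 hugepages */
        for (i = 0; i < NR_HPAGES - 1; i++)
                memset(p + i * HPAGE_SIZE, 1, HPAGE_SIZE);
        printf("inspect /proc/%d/smaps\n", getpid());
        pause();
        return 0;
}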
before patch:

Size:              20480 kB
Rss:                   0 kB
Pss:                   0 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
Referenced:            0 kB
Anonymous:             0 kB
AnonHugePages:         0 kB
Swap:                  0 kB
KernelPageSize:     2048 kB
MMUPageSize:        2048 kB
Locked:                0 kB
VmFlags: rd wr mr mw me de ht

after patch:

Size:              20480 kB
Rss:               18432 kB
Pss:               18432 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:     18432 kB
Referenced:        18432 kB
Anonymous:         18432 kB
AnonHugePages:         0 kB
Swap:                  0 kB
KernelPageSize:     2048 kB
MMUPageSize:        2048 kB
Locked:                0 kB
VmFlags: rd wr mr mw me de ht

Signed-off-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
---
 fs/proc/task_mmu.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index ca1e091881d4..c7218603306d 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -610,12 +610,39 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 	seq_putc(m, '\n');
 }
 
+#ifdef CONFIG_HUGETLB_PAGE
+static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask,
+				 unsigned long addr, unsigned long end,
+				 struct mm_walk *walk)
+{
+	struct mem_size_stats *mss = walk->private;
+	struct vm_area_struct *vma = walk->vma;
+	struct page *page = NULL;
+
+	if (pte_present(*pte)) {
+		page = vm_normal_page(vma, addr, *pte);
+	} else if (is_swap_pte(*pte)) {
+		swp_entry_t swpent = pte_to_swp_entry(*pte);
+
+		if (is_migration_entry(swpent))
+			page = migration_entry_to_page(swpent);
+	}
+	if (page)
+		smaps_account(mss, page, huge_page_size(hstate_vma(vma)),
+				pte_young(*pte), pte_dirty(*pte));
+	return 0;
+}
+#endif /* HUGETLB_PAGE */
+
 static int show_smap(struct seq_file *m, void *v, int is_pid)
 {
 	struct vm_area_struct *vma = v;
 	struct mem_size_stats mss;
 	struct mm_walk smaps_walk = {
 		.pmd_entry = smaps_pte_range,
+#ifdef CONFIG_HUGETLB_PAGE
+		.hugetlb_entry = smaps_hugetlb_range,
+#endif
 		.mm = vma->vm_mm,
 		.private = &mss,
 	};
-- 
2.4.3
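
As a note on the per-task accounting Mike and Jörn were asking about: with
this patch applied, a task's hugetlb footprint can also be summed from smaps
in user space by keying off the "ht" flag in VmFlags. A rough sketch of such
a helper (an assumed example, not part of the patch) might look like:

/*
 * Assumed helper sketch: sum the Rss of all hugetlb VMAs of a task by
 * parsing /proc/<pid>/smaps.  Relies on the fields filled in by the
 * patch above and on "ht" appearing in VmFlags for VM_HUGETLB mappings;
 * error handling is kept minimal.
 */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
        char path[64], line[256];
        unsigned long rss_kb = 0, total_kb = 0;
        FILE *f;

        if (argc != 2) {
                fprintf(stderr, "usage: %s <pid>\n", argv[0]);
                return 1;
        }
        snprintf(path, sizeof(path), "/proc/%s/smaps", argv[1]);
        f = fopen(path, "r");
        if (!f) {
                perror(path);
                return 1;
        }
        while (fgets(line, sizeof(line), f)) {
                /* Rss: appears before VmFlags: within each smaps entry */
                if (sscanf(line, "Rss: %lu kB", &rss_kb) == 1)
                        continue;
                if (!strncmp(line, "VmFlags:", 8) && strstr(line, " ht"))
                        total_kb += rss_kb;
        }
        fclose(f);
        printf("hugetlb rss: %lu kB\n", total_kb);
        return 0;
}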