(2013/05/24 10:11), Atsushi Kumagai wrote:
> On Wed, 15 May 2013 13:43:59 -0500
> Cliff Wickman <cpw at sgi.com> wrote:
>> 		/*
>> 		 * The bitmap buffer is not dirty, and it is not necessary
>> 		 * to write out it.
>> @@ -4345,7 +4386,7 @@ __exclude_unnecessary_pages(unsigned lon
>> 		 * Exclude the data page of the user process.
>> 		 */
>> 		else if ((info->dump_level & DL_EXCLUDE_USER_DATA)
>> -		    && isAnon(mapping)) {
>> +		    && (isAnon(mapping) || isHugePage(flags))) {
>> 			if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
>> 				pfn_user++;
>
> This code will filter out both transparent huge pages and hugetlbfs pages.
> But I think hugetlbfs pages shouldn't be filtered out as anonymous pages,
> because anonymous pages and hugetlbfs pages are certainly different kinds
> of pages.
>
> Therefore, I plan to add a new dump level to filter out hugetlbfs pages.
> My basic idea is like below:
>
> 		 * Exclude the data page of the user process.
> 		 */
> 		else if ((info->dump_level & DL_EXCLUDE_USER_DATA)
> 		    && isAnon(mapping)) {
> 			if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
> 				pfn_user++;
> +
> +			if (isPageHead(flags)) {
> +				int i, nr_pages = 1 << compound_order;
> +
> +				for (i = 1; i < nr_pages; ++i) {
> +					if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
> +						pfn_user++;
> +				}
> +				pfn += nr_pages - 2;
> +				mem_map += (nr_pages - 1) * SIZE(page);
> +			}

This seems correct. We don't know when and where a page table gets remapped
by huge pages transparently. If user-space pages and transparent huge pages
were filtered separately, data larger than the huge page size, 2MiB or 1GiB
on x86, could be corrupted. Consider the case where 2MiB + 4KiB of data is
allocated on a 2MiB boundary and only the first 2MiB area is remapped by
THP. We should filter them together.

> +		}
> +		/*
> +		 * Exclude the hugetlbfs pages.
> +		 */
> +		else if ((info->dump_level & DL_EXCLUDE_HUGETLBFS_DATA)
> +		    && (!isAnon(mapping) && isPageHead(flags))) {
> +			int i, nr_pages = 1 << compound_order;
> +
> +			for (i = 0; i < nr_pages; ++i) {
> +				if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
> +					pfn_hugetlb++;
> +			}
> +			pfn += nr_pages - 1;
> +			mem_map += (nr_pages - 1) * SIZE(page);
> 		}
> 		/*
> 		 * Exclude the hwpoison page.

hugetlbfs has no such risk, so this seems basically OK to me.

Also, hugetlbfs pages are "user data". They should get filtered when the
user data dump level is specified. So how about:

  Dump  | zero  without  with     user  free  hugetlbfs
  Level | page  private  private  data  page  page
 -------+-----------------------------------------------------
     0  |
     1  |  X
     2  |         X
     4  |         X        X
     8  |                          X           X
    16  |                                X
    32  |                                      X
    63  |  X      X        X       X     X     X

On dump level 8, hugetlbfs pages also get filtered.

> But before that, I would like to hear the opinions about the new dump level
> to make sure that the level is worthy or not.
> --

Thanks.
HATAYAMA, Daisuke