On 2013/11/22 16:18:20, kexec <kexec-bounces at lists.infradead.org> wrote:
> (2013/11/07 9:54), HATAYAMA Daisuke wrote:
> > (2013/11/06 11:21), Atsushi Kumagai wrote:
> >> (2013/11/06 5:27), Vivek Goyal wrote:
> >>> On Tue, Nov 05, 2013 at 09:45:32PM +0800, Jingbai Ma wrote:
> >>>> This patch set intends to exclude unnecessary hugepages from vmcore dump file.
> >>>>
> >>>> This patch requires the kernel patch to export necessary data structures into
> >>>> vmcore: "kexec: export hugepage data structure into vmcoreinfo"
> >>>> http://lists.infradead.org/pipermail/kexec/2013-November/009997.html
> >>>>
> >>>> This patch introduces two new dump levels 32 and 64 to exclude all unused and
> >>>> active hugepages. The level to exclude all unnecessary pages will be 127 now.
> >>>
> >>> Interesting. Why should hugepages be treated any differently than normal
> >>> pages?
> >>>
> >>> If user asked to filter out free page, then it should be filtered and
> >>> it should not matter whether it is a huge page or not?
> >>
> >> I'm making an RFC patch of hugepages filtering based on such policy.
> >>
> >> I attach the prototype version.
> >> It's also able to filter out THPs, and is suitable for cyclic processing
> >> because it depends on mem_map, and looking it up can be divided into
> >> cycles. This is the same idea as page_is_buddy().
> >>
> >> So I think it's better.
> >>
> >
> >> @@ -4506,14 +4583,49 @@ __exclude_unnecessary_pages(unsigned long mem_map,
> >>  		    && !isAnon(mapping)) {
> >>  			if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
> >>  				pfn_cache_private++;
> >> +			/*
> >> +			 * NOTE: If THP for cache is introduced, the check for
> >> +			 *       compound pages is needed here.
> >> +			 */
> >>  		}
> >>  		/*
> >>  		 * Exclude the data page of the user process.
> >>  		 */
> >> -		else if ((info->dump_level & DL_EXCLUDE_USER_DATA)
> >> -		    && isAnon(mapping)) {
> >> -			if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
> >> -				pfn_user++;
> >> +		else if (info->dump_level & DL_EXCLUDE_USER_DATA) {
> >> +			/*
> >> +			 * Exclude the anonymous pages as user pages.
> >> +			 */
> >> +			if (isAnon(mapping)) {
> >> +				if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
> >> +					pfn_user++;
> >> +
> >> +				/*
> >> +				 * Check the compound page
> >> +				 */
> >> +				if (page_is_hugepage(flags) && compound_order > 0) {
> >> +					int i, nr_pages = 1 << compound_order;
> >> +
> >> +					for (i = 1; i < nr_pages; ++i) {
> >> +						if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
> >> +							pfn_user++;
> >> +					}
> >> +					pfn += nr_pages - 2;
> >> +					mem_map += (nr_pages - 1) * SIZE(page);
> >> +				}
> >> +			}
> >> +			/*
> >> +			 * Exclude the hugetlbfs pages as user pages.
> >> +			 */
> >> +			else if (hugetlb_dtor == SYMBOL(free_huge_page)) {
> >> +				int i, nr_pages = 1 << compound_order;
> >> +
> >> +				for (i = 0; i < nr_pages; ++i) {
> >> +					if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
> >> +						pfn_user++;
> >> +				}
> >> +				pfn += nr_pages - 1;
> >> +				mem_map += (nr_pages - 1) * SIZE(page);
> >> +			}
> >>  		}
> >>  		/*
> >>  		 * Exclude the hwpoison page.
> >
> > I'm concerned about the case where filtering is not performed on the part of
> > the mem_map entries that does not belong to the current cyclic range.
> >
> > If the maximum value of compound_order is larger than the maximum value of
> > CONFIG_FORCE_MAX_ZONEORDER, which makedumpfile obtains by ARRAY_LENGTH(zone.free_area),
> > it's necessary to align info->bufsize_cyclic with the larger one in
> > check_cyclic_buffer_overrun().
> >
>
> ping, in case you overlooked this...

Sorry for the delayed response; I'm prioritizing the release of v1.5.5 now.

Thanks for your advice, check_cyclic_buffer_overrun() should be fixed as you said.
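
[Editor's illustration] The alignment suggested above can be sketched roughly as follows. This is a minimal, standalone sketch, not the actual check_cyclic_buffer_overrun() code: the helper names max_block_pages() and align_bufsize() are hypothetical, and it assumes the cyclic buffer is a bitmap with one bit per page whose first cycle starts at PFN 0.

```c
#include <assert.h>

#define BITS_PER_BYTE 8UL

/*
 * Hypothetical helper: number of pages in the largest block that must
 * not straddle a cycle boundary, i.e. the larger of the maximum buddy
 * order and the maximum compound (hugepage) order.
 */
static unsigned long max_block_pages(unsigned int max_buddy_order,
                                     unsigned int max_compound_order)
{
    unsigned int order = max_buddy_order > max_compound_order
                       ? max_buddy_order : max_compound_order;

    return 1UL << order;
}

/*
 * Hypothetical helper: round a bitmap buffer size (in bytes, one bit
 * per page) down to a whole number of blocks, but never below the
 * size of a single block.
 */
static unsigned long align_bufsize(unsigned long bufsize_bytes,
                                   unsigned long block_pages)
{
    unsigned long block_bytes;

    assert(block_pages > 0);

    /* Bytes of bitmap needed to cover one block, rounded up. */
    block_bytes = (block_pages + BITS_PER_BYTE - 1) / BITS_PER_BYTE;

    if (bufsize_bytes < block_bytes)
        return block_bytes;

    return bufsize_bytes - (bufsize_bytes % block_bytes);
}
```

With an aligned buffer size, every cycle covers a whole number of maximum-size blocks, so no hugepage crosses into the next cycle.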

In addition, I'm considering another way to address this case: carry the
number of "overflowed pages" over to the next cycle and exclude them at the
top of __exclude_unnecessary_pages(), like below:

		/*
		 * The pages which should be excluded still remain.
		 */
		if (remainder >= 1) {
			int i;
			unsigned long tmp = 0;	/* pages actually excluded in this cycle */

			for (i = 0; i < remainder; ++i) {
				if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i)) {
					pfn_user++;
					tmp++;
				}
			}
			pfn += tmp;
			remainder -= tmp;
			mem_map += (tmp - 1) * SIZE(page);
			continue;
		}

If this approach works well, then aligning info->bufsize_cyclic will be
unnecessary.

Thanks
Atsushi Kumagai

> --
> Thanks.
> HATAYAMA, Daisuke
>
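
[Editor's illustration] The carry-over idea described above can be sketched in isolation as follows. This is a simplified model under stated assumptions, not makedumpfile code: struct carry, exclude_range() and drain_carry() are hypothetical names, a plain per-PFN flag array stands in for the 2nd bitmap, and a cycle is the half-open PFN range [cycle_start, cycle_end). Unlike the snippet in the mail, it advances by the number of pages that fit in the cycle rather than by the number of bits actually cleared.

```c
#include <assert.h>

/* Hypothetical state carried from one cycle to the next. */
struct carry {
    unsigned long remainder;    /* pages still to exclude in the next cycle */
};

/*
 * Exclude nr_pages pages starting at pfn, clamped to the current cycle,
 * which ends at cycle_end (exclusive). Pages that fall beyond the cycle
 * are recorded in carry->remainder. Returns the number of pages excluded
 * in this cycle.
 */
static unsigned long exclude_range(unsigned char *excluded,
                                   unsigned long pfn,
                                   unsigned long nr_pages,
                                   unsigned long cycle_end,
                                   struct carry *carry)
{
    unsigned long i, in_cycle = nr_pages;

    assert(pfn <= cycle_end);

    if (nr_pages > cycle_end - pfn)
        in_cycle = cycle_end - pfn;

    for (i = 0; i < in_cycle; i++)
        excluded[pfn + i] = 1;

    carry->remainder = nr_pages - in_cycle;
    return in_cycle;
}

/*
 * At the top of the next cycle, exclude the carried-over pages before
 * looking at any other page. Returns the number of pages excluded.
 */
static unsigned long drain_carry(unsigned char *excluded,
                                 unsigned long cycle_start,
                                 unsigned long cycle_end,
                                 struct carry *carry)
{
    return exclude_range(excluded, cycle_start, carry->remainder,
                         cycle_end, carry);
}
```

For example, with 16-page cycles, an 8-page hugepage starting at PFN 12 has 4 pages excluded in the first cycle and leaves a remainder of 4, which drain_carry() clears at PFNs 16 to 19 at the start of the second cycle.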