----- Original Message -----
> Dave, I confirmed that if I use -d17 for makedumpfile I can then capture a
> usable core.
> I am on 4.5.0-0.rc3.git3.1
> Thanks
> Laurence
>
> Laurence Oberman
> Principal Software Maintenance Engineer
> Red Hat Global Support Services

OK, good.  And actually, you might be able to get away with filtering
cache-with-private and cache-without-private pages as well.  Looking
further at the makedumpfile code, only the user-page test checks for the
PAGE_MAPPING_ANON bit (the low bit of page->mapping) to be set.  The two
page-cache variants look for it to be 0:

                /*
                 * Exclude the non-private cache page.
                 */
                else if ((info->dump_level & DL_EXCLUDE_CACHE)
                    && (isLRU(flags) || isSwapCache(flags))
                    && !isPrivate(flags) && !isAnon(mapping)) {
                        pfn_counter = &pfn_cache;
                }
                /*
                 * Exclude the cache page whether private or non-private.
                 */
                else if ((info->dump_level & DL_EXCLUDE_CACHE_PRI)
                    && (isLRU(flags) || isSwapCache(flags))
                    && !isAnon(mapping)) {
                        if (isPrivate(flags))
                                pfn_counter = &pfn_cache_private;
                        else
                                pfn_counter = &pfn_cache;
                }
                /*
                 * Exclude the data page of the user process.
                 *  - anonymous pages
                 *  - hugetlbfs pages
                 */
                else if ((info->dump_level & DL_EXCLUDE_USER_DATA)
                    && (isAnon(mapping) || isHugetlb(compound_dtor))) {
                        pfn_counter = &pfn_user;
                }

The isAnon() function looks like this:

  static inline int
  isAnon(unsigned long mapping)
  {
          return ((unsigned long)mapping & PAGE_MAPPING_ANON) != 0;
  }

Note above that only DL_EXCLUDE_USER_DATA uses isAnon(), whereas the
other two use !isAnon().  So if my logic is correct, if you try to filter
out page-cache pages as well -- i.e., with "-d23" -- the worst case is
that some page-cache pages are *not* filtered.  And I'm not even sure of
that, given the page->flags checks that go along with it.
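
The arithmetic behind "-d17" and "-d23" can be spelled out with a small
standalone sketch.  The DL_EXCLUDE_* values below are assumed to match
makedumpfile.h, and PAGE_MAPPING_ANON is assumed to be the low bit of
page->mapping; check both against your own trees.  The two helper
functions are hypothetical, written only for this illustration, and they
leave out the page->flags and hugetlb parts of the real tests.

```c
#include <stdint.h>

/* Dump-level bits; values assumed to match makedumpfile.h. */
#define DL_EXCLUDE_ZERO      0x001 /* pages filled with zero */
#define DL_EXCLUDE_CACHE     0x002 /* non-private cache pages */
#define DL_EXCLUDE_CACHE_PRI 0x004 /* private and non-private cache pages */
#define DL_EXCLUDE_USER_DATA 0x008 /* user process data pages */
#define DL_EXCLUDE_FREE      0x010 /* free pages */

/* Low bit of page->mapping; value assumed from the kernel headers. */
#define PAGE_MAPPING_ANON 1ULL

static int isAnon(uint64_t mapping)
{
    return (mapping & PAGE_MAPPING_ANON) != 0;
}

/*
 * Hypothetical helper: would the user-data test claim this page?
 * The hugetlb arm of the real test is left out.
 */
static int user_filter_hits(int dump_level, uint64_t mapping)
{
    return (dump_level & DL_EXCLUDE_USER_DATA) && isAnon(mapping);
}

/*
 * Hypothetical helper: could either page-cache test claim this page?
 * The page->flags checks are left out, so 1 only means "not ruled out".
 */
static int cache_filter_can_hit(int dump_level, uint64_t mapping)
{
    return (dump_level & (DL_EXCLUDE_CACHE | DL_EXCLUDE_CACHE_PRI))
        && !isAnon(mapping);
}
```

With the poisoned value dead0000ffffffff, user_filter_hits() returns 1
for -d31 but 0 for -d17 and -d23, since neither 17 (0x11) nor 23 (0x17)
contains DL_EXCLUDE_USER_DATA.  cache_filter_can_hit() returns 0 for
-d23 on that value, which is the "some pages not filtered" case.
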

Dave

>
> ----- Original Message -----
> From: "Dave Anderson" <anderson at redhat.com>
> To: ats-kumagai at wm.jp.nec.com
> Cc: kexec at lists.infradead.org, "Discussion list for crash utility usage,
> maintenance and development" <crash-utility at redhat.com>, "Joe Lawrence"
> <joe.lawrence at stratus.com>, "Laurence Oberman" <loberman at redhat.com>
> Sent: Thursday, February 18, 2016 12:05:11 PM
> Subject: makedumpfile: 4.5 kernel commit breaks page filtering
>
>
>
> Hello Atsushi,
>
> I've recently had a couple of 4.5-era vmcore issues reported to me as
> crash bugs because they generate numerous initialization-time errors of
> the type:
>
>   crash: page excluded: kernel virtual address: ffff880075459000 type:
>   "fill_task_struct"
>
> Initially I thought it was related to this crash-7.1.4 fix that you posted:
>
>   Fix for the handling of dynamically-sized task_struct structures in
>   Linux 4.2 and later kernels, which contain these commits:
>
>     commit 5aaeb5c01c5b6c0be7b7aadbf3ace9f3a4458c3d
>     x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and
>     use it on x86
>     commit 0c8c0f03e3a292e031596484275c14cf39c0ab7a
>     x86/fpu, sched: Dynamically allocate 'struct fpu'
>
>   Without the patch, when running on a filtered kdump dumpfile, it is
>   possible that error messages like this will be seen when gathering
>   the tasks running on a system: "crash: page excluded: kernel virtual
>   address: <task_struct address> type: "fill_task_struct".
>   (ats-kumagai at wm.jp.nec.com)
>
> But upon further investigation of a suspect vmcore, there are many other
> "page excluded" errors for several other data structure types.  Joe Lawrence
> of Stratus did some kernel-bisecting, and narrowed it down to this recent
> 4.5 commit:
>
>   commit 1c290f642101e64f379e38ea0361d097c08e824d
>   Author: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
>   Date:   Fri Jan 15 16:52:07 2016 -0800
>
>     mm: sanitize page->mapping for tail pages
>
>     We don't define meaning of page->mapping for tail pages.  Currently it's
>     always NULL, which can be inconsistent with head page and potentially
>     lead to problems.
>
>     Let's poison the pointer to catch all illigal uses.
>
>     page_rmapping(), page_mapping() and page_anon_vma() are changed to look
>     on head page.
>
>     The only illegal use I've caught so far is __GPF_COMP pages from sound
>     subsystem, mapped with PTEs.  do_shared_fault() is changed to use
>     page_rmapping() instead of direct access to fault_page->mapping.
>
>     Signed-off-by: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
>     Reviewed-by: Jérôme Glisse <jglisse at redhat.com>
>     Cc: Andrea Arcangeli <aarcange at redhat.com>
>     Cc: Hugh Dickins <hughd at google.com>
>     Cc: Dave Hansen <dave.hansen at intel.com>
>     Cc: Mel Gorman <mgorman at suse.de>
>     Cc: Rik van Riel <riel at redhat.com>
>     Cc: Vlastimil Babka <vbabka at suse.cz>
>     Cc: Christoph Lameter <cl at linux.com>
>     Cc: Naoya Horiguchi <n-horiguchi at ah.jp.nec.com>
>     Cc: Steve Capper <steve.capper at linaro.org>
>     Cc: "Aneesh Kumar K.V" <aneesh.kumar at linux.vnet.ibm.com>
>     Cc: Johannes Weiner <hannes at cmpxchg.org>
>     Cc: Michal Hocko <mhocko at suse.cz>
>     Cc: Jerome Marchand <jmarchan at redhat.com>
>     Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
>
> And related to the above, and the one that affects makedumpfile, is this one:
>
>   commit 822cdd1152265d87fcfc974e06c3b68f762987fd
>   Author: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
>   Date:   Fri Jan 15 16:52:03 2016 -0800
>
>     page-flags: look at head page if the flag is encoded in page->mapping
>
>     PageAnon() and PageKsm() look at lower bits of page->mapping to check if
>     the page is Anon or KSM.  page->mapping can be overloaded in tail pages.
>
>     Let's always look at head page to avoid false-positives.
>
>     Signed-off-by: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
>     Cc: Andrea Arcangeli <aarcange at redhat.com>
>     Cc: Hugh Dickins <hughd at google.com>
>     Cc: Dave Hansen <dave.hansen at intel.com>
>     Cc: Mel Gorman <mgorman at suse.de>
>     Cc: Rik van Riel <riel at redhat.com>
>     Cc: Vlastimil Babka <vbabka at suse.cz>
>     Cc: Christoph Lameter <cl at linux.com>
>     Cc: Naoya Horiguchi <n-horiguchi at ah.jp.nec.com>
>     Cc: Steve Capper <steve.capper at linaro.org>
>     Cc: "Aneesh Kumar K.V" <aneesh.kumar at linux.vnet.ibm.com>
>     Cc: Johannes Weiner <hannes at cmpxchg.org>
>     Cc: Michal Hocko <mhocko at suse.cz>
>     Cc: Jerome Marchand <jmarchan at redhat.com>
>     Cc: Jérôme Glisse <jglisse at redhat.com>
>     Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 818fa39..190f191 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -176,7 +176,7 @@ static inline int PageCompound(struct page *page)
>  #define PF_NO_TAIL(page, enforce) ({                                   \
>                 VM_BUG_ON_PGFLAGS(enforce && PageTail(page), page);     \
>                 compound_head(page);})
> -#define PF_NO_COMPOUND(page, enforce) ({                                       \
> +#define PF_NO_COMPOUND(page, enforce) ({                               \
>                 VM_BUG_ON_PGFLAGS(enforce && PageCompound(page), page); \
>                 page;})
>
> @@ -381,6 +381,7 @@ PAGEFLAG(Idle, idle, PF_ANY)
>
>  static inline int PageAnon(struct page *page)
>  {
> +       page = compound_head(page);
>         return ((unsigned long)page->mapping & PAGE_MAPPING_ANON) != 0;
>  }
>
> @@ -393,6 +394,7 @@ static inline int PageAnon(struct page *page)
>   */
>  static inline int PageKsm(struct page *page)
>  {
> +       page = compound_head(page);
>         return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) ==
>                 (PAGE_MAPPING_ANON | PAGE_MAPPING_KSM);
>  }
>
> Note that PAGE_MAPPING_ANON is now only set in the compound_head page,
> so when makedumpfile walks through the pages, it will have to look
> at each page's head page for the bit setting.
>
> As it is now, makedumpfile runs amok filtering pages that still have
> stuff left in page->mapping.  For example, all of the addresses in
> my "filtered.list" input file are those of legitimate kernel data
> structures that have been incorrectly filtered because PAGE_MAPPING_ANON
> (the low bit, 0x1) has been left set:
>
>   crash> kmem -p < filtered.list
>         PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
>   ffffea0011b29040 46ca41000 dead0000ffffffff        0  0 3ffff800000000
>         PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
>   ffffea0011b29040 46ca41000 dead0000ffffffff        0  0 3ffff800000000
>         PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
>   ffffea0011b29640 46ca59000 dead0000ffffffff        0  0 3ffff800000000
>         PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
>   ffffea0011b29640 46ca59000 dead0000ffffffff        0  0 3ffff800000000
>         PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
>   ffffea0001d51640  75459000 dead0000ffffffff        0  0 1ffff800000000
>   ...
>
> In earlier kernels, the page->mapping fields above would not have
> their PAGE_MAPPING_ANON bit set.
>
> Dave
>
>
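
The head-page lookup described in the quoted message can be modeled in a
few lines.  This is only a sketch under stated assumptions, not
makedumpfile code and not a proposed patch: struct page_model and both
functions are hypothetical, and how a dump filter would actually locate
the head page in raw memory is left open.

```c
#include <stddef.h>
#include <stdint.h>

/* Low bit of page->mapping; value assumed from the kernel headers. */
#define PAGE_MAPPING_ANON 1ULL

/* Hypothetical simplified page descriptor, not the kernel's struct page. */
struct page_model {
    uint64_t mapping; /* raw page->mapping value */
    long head;        /* index of the compound head, or -1 if not a tail */
};

/* Tests the page's own mapping, as a filter unaware of the change would. */
static int is_anon_naive(const struct page_model *pages, size_t idx)
{
    return (pages[idx].mapping & PAGE_MAPPING_ANON) != 0;
}

/* Redirects a tail page to its head first, mirroring what PageAnon() does. */
static int is_anon_via_head(const struct page_model *pages, size_t idx)
{
    if (pages[idx].head >= 0)
        idx = (size_t)pages[idx].head;
    return (pages[idx].mapping & PAGE_MAPPING_ANON) != 0;
}
```

For a kernel compound page whose head has mapping 0 and whose tail
carries dead0000ffffffff, is_anon_naive() on the tail returns 1 while
is_anon_via_head() returns 0, which is the difference between the page
being wrongly filtered as user data and being kept.
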