----- Original Message -----
> Hi,
>
> On Mon, Mar 18, 2013 at 8:29 AM, Dave Anderson <anderson@xxxxxxxxxx> wrote:
> > By classification, do you mean which bit in the filtering option
> > of makedumpfile?
>
> Exactly.
>
> >> Per your request:
> >>
> >> > crash> struct page 0xffffea001cdad420
> >> > struct page {
> >> >   flags = 0x200000000000000,
> [...]
>
> > OK, looks like a page struct (most likely)...
>
> I was already pretty sure.  Confirmed.
>
> >> > crash> kmem -p | tail
> >>
> >> OK, here's mine, along with the closest page numbers:
> >>
> >> >       PAGE        PHYSICAL    MAPPING  INDEX  CNT  FLAGS
> >> > [...]
> >> > ffffea64e939b6f0  1cc4b7fff000      0      0    0  0
> >> <<fin>>
>
> > Wow, that system has physical memory installed at an unusually high
> > physical address location, i.e., where 1cc4b7fff000 is up around
> > 28 terabytes?
>
> That seems large to me too, by about a factor of 10.
> It _is_ a largish system.

What does the initial system banner (or the "sys" command) show?

> > I'd be interested in seeing a dump of "kmem -n".  In your case the output
> > is probably huge, but the top part would reflect the physical
> > memory layout,
>
> NODE    SIZE      PGLIST_DATA       BOOTMEM_DATA       NODE_ZONES
>   0   8912880   ffff88087fffb000       ----          ffff88087fffb000
>                                                       ffff88087fffb980
>                                                       ffff88087fffc300
>                                                       ffff88087fffcc80
>     MEM_MAP         START_PADDR    START_MAPNR
> ffffea0000000380       10000            16
>
> ZONE  NAME      SIZE        MEM_MAP        START_PADDR  START_MAPNR
>   0   DMA        4080  ffffea0000000380       10000          16
>   1   DMA32   1044480  ffffea0000038000     1000000        4096
>   2   Normal  7864320  ffffea0003800000   100000000     1048576
>   3   Movable       0                 0           0           0
>
> -------------------------------------------------------------------
>
> NODE    SIZE      PGLIST_DATA       BOOTMEM_DATA       NODE_ZONES
>   1   8388608   ffff88107fffa040       ----          ffff88107fffa040
>                                                       ffff88107fffa9c0
>                                                       ffff88107fffb340
>                                                       ffff88107fffbcc0
>     MEM_MAP         START_PADDR    START_MAPNR
> ffffffffffffffff     880000000       8912896
>
> ZONE  NAME      SIZE        MEM_MAP        START_PADDR  START_MAPNR
>   0   DMA           0                 0           0           0
>   1   DMA32         0                 0           0           0
>   2   Normal  8388608                 0   880000000     8912896
>   3   Movable       0                 0           0           0

At first I didn't understand how there could be a MEM_MAP of "0" for the
NODE 1 physical memory section starting at 34GB (880000000).  It indicates
that there are 8388608 pages (32GB) starting at 880000000.  So the highest
physical address would be 0x1080000000 (66GB), which would be a max_pfn
value of 0x1080000000 / 4k, or 17301504 decimal.

But after section 131, the PFN values start at 4328521728 -- which is
16512GB (~16.5 TB).  So clearly the section data is being misinterpreted,
and because of that, phys_to_page() fails to find a MEM_MAP address for a
physical address of 880000000 (i.e. a pfn of 8912896), since the section
data skips from a PFN of 4292608 to the bizarre 4328521728:

>     NR      SECTION        CODED_MEM_MAP        MEM_MAP         PFN
>      0  ffff88087fffa000  ffffea0000000000  ffffea0000000000  0
>      1  ffff88087fffa020  ffffea0000000000  ffffea00001c0000  32768
>      2  ffff88087fffa040  ffffea0000000000  ffffea0000380000  65536
> [...]
>    130  ffff88107fff9040  ffffea0000000000  ffffea000e380000  4259840
>    131  ffff88107fff9060  ffffea0000000000  ffffea000e540000  4292608
> 132096  ffff880838574558  ffff881038105798  ffff8848a8105798  4328521728
> 132098  ffff880838574598  ffff880837ed2c00  ffff8840a8252c00  4328587264
> [...]
> 237504  ffff8810369d2f40  ffff8810369d2f40  ffff8875af9d2f40  7782531072
> 237505  ffff8810369d2f60           1a48b64        657ac08b64  7782563840
> 237506  ffff8810369d2f80          3686dc30        65afbedc30  7782596608
> 237507  ffff8810369d2fa0  ffff881033219740  ffff8875ac759740  7782629376
> kmem: page excluded: kernel virtual address: ffff8810369d3000  type: "memory section"
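Just to put numbers on how far off those section entries are, here is a
back-of-the-envelope sketch of the section-number/PFN arithmetic.  It assumes
the usual x86_64 sparsemem-vmemmap geometry of SECTION_SIZE_BITS = 27 and 4k
pages (32768 pages per section); those constants are assumptions on my part,
not something read out of your dump:

#include <stdio.h>

/* Assumed x86_64 sparsemem geometry -- not read from the dump itself. */
#define PAGE_SHIFT        12
#define SECTION_SIZE_BITS 27
#define PFN_SECTION_SHIFT (SECTION_SIZE_BITS - PAGE_SHIFT)  /* 15 */
#define PAGES_PER_SECTION (1UL << PFN_SECTION_SHIFT)        /* 32768 */

int main(void)
{
        unsigned long max_pfn = 0x1080000000UL >> PAGE_SHIFT;   /* 17301504 */

        /* A section's starting PFN is just section_nr * 32768 ... */
        printf("section 132096 starts at pfn %lu\n",
               132096UL << PFN_SECTION_SHIFT);                  /* 4328521728 */

        /* ... but a 66GB machine only has sections 0 through 527. */
        printf("last possible section: %lu\n",
               (max_pfn - 1) >> PFN_SECTION_SHIFT);             /* 527 */

        /* phys 880000000 (pfn 8912896) would live in section 272. */
        printf("section for pfn 8912896: %lu\n",
               8912896UL >> PFN_SECTION_SHIFT);                 /* 272 */
        return 0;
}

Note that 4328521728 is exactly 132096 * 32768, so the bogus PFN column is
just the bogus section numbers scaled by pages-per-section; and section 272,
which is where phys_to_page() would have to look for physical address
880000000, is nowhere in the sane part of the list.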
> > So your target page structure should "fit" into one of the
> > sections above, where the starting MEM_MAP address of each
> > section should have a contiguous array of page structs that
> > reference the array of physical pages starting at the "PFN"
> > value.  Those MEM_MAP addresses are typically increasing in
> > value with each section, but I believe that I have seen cases
> > where they are not.  And they shouldn't have to be; each section
> > has a base vmemmap address for some number of PFN/physical-pages.
>
> OK.  That's a bit confusing for me.

So again, the output with the full kmem -n display contains bizarre values
after section 131, causing it to go off into the weeds:

     ...
     127  ffff88087fffafe0  ffffea0000000000  ffffea000de40000  4161536  (ok)
     128  ffff88107fff9000  ffffea0000000000  ffffea000e000000  4194304  (ok)
     129  ffff88107fff9020  ffffea0000000000  ffffea000e1c0000  4227072  (ok)
     130  ffff88107fff9040  ffffea0000000000  ffffea000e380000  4259840  (ok)
     131  ffff88107fff9060  ffffea0000000000  ffffea000e540000  4292608  (ok)
  132096  ffff880838574558  ffff881038105798  ffff8848a8105798  4328521728  (bogus from here onward...)
  132098  ffff880838574598  ffff880837ed2c00  ffff8840a8252c00  4328587264
  132099  ffff8808385745b8  ffff880835850400  ffff8840a5d90400  4328620032
  132100  ffff8808385745d8  ffff8810342e1c00  ffff8848a49e1c00  4328652800
  132101  ffff8808385745f8  ffff8810342e2c00  ffff8848a4ba2c00  4328685568
  132102  ffff880838574618  ffff880833a52000  ffff8840a44d2000  4328718336
  132103  ffff880838574638  ffff8808354c0c00  ffff8840a6100c00  4328751104
  132104  ffff880838574658  ffff8810342e3c00  ffff8848a50e3c00  4328783872
  132105  ffff880838574678  ffff8810342e4c00  ffff8848a52a4c00  4328816640
  132110  ffff880838574718                20        3871880020  4328980480
  132112  ffff880838574758  ffff881037fa3718  ffff8848a9ba3718  4329046016
  132114  ffff880838574798  ffff880833a13c00  ffff8840a5993c00  4329111552
  132115  ffff8808385747b8  ffff8808386a0800  ffff8840aa7e0800  4329144320
     ...

So clearly crash is mishandling the memory setup being presented to it.
But I have *no* idea what the problem is.

> > Anyway, it does look like a page structure, and the page structure pointer
> > itself is translatable.  The problem at hand is that the physical address
> > that the page structure refers to is not being determined, because the page
> > structure address itself is not being recognized by is_page_ptr() as being
> > part of the sparsemem infrastructure.  The "if IS_SPARSEMEM()" section at
> > the top of is_page_ptr() is returning FALSE.
> >
> > That being said, from your target page structure address and the "kmem -n"
> > output, you could presumably calculate the associated physical address.
>
> If the kmem -n output didn't seem to skip over the address of
> interest....

Right.  It would walk through all of the sections, but given the obviously
misinterpreted section data above, it would not find your target page.
After section 131, the MEM_MAP addresses shown are not even in the vmemmap
virtual range, which starts at ffffea0000000000.
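That said, if you just want a ballpark figure now: the valid sections above
all share the CODED_MEM_MAP of ffffea0000000000, i.e. a linear vmemmap, and
the MEM_MAP offsets imply a 56-byte struct page on this kernel (section 1
sits at offset 0x1c0000 for 32768 pages).  Under those assumptions, and they
are assumptions on my part rather than something crash verified, the manual
calculation for your target page struct would look roughly like this:

#include <stdio.h>

/*
 * Back-of-the-envelope only: assumes the linear vmemmap layout implied by
 * the valid sections above (the page struct for pfn N sits at
 * VMEMMAP_START + N * sizeof(struct page)), and a 56-byte struct page as
 * implied by section 1 sitting at offset 0x1c0000 for 32768 pages.
 */
#define VMEMMAP_START    0xffffea0000000000UL
#define PAGE_SHIFT       12
#define STRUCT_PAGE_SIZE 56UL

int main(void)
{
        unsigned long page = 0xffffea001cdad420UL;   /* the target page struct */
        unsigned long pfn  = (page - VMEMMAP_START) / STRUCT_PAGE_SIZE;

        printf("pfn:     %lu\n", pfn);                /* 8644700 */
        printf("phys:    0x%lx\n", pfn << PAGE_SHIFT);/* 0x83e85c000, ~33GB */
        printf("section: %lu\n", pfn >> 15);          /* 263 (32768 pages/section) */
        return 0;
}

If that arithmetic holds, the page struct refers to physical memory around
33GB, which lands in section 263: a perfectly ordinary section on a 66GB
machine, but one that falls in the region after 131 where the section data
goes bad.  That would be consistent with is_page_ptr() returning FALSE.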
> >> The memory in question is probably not in the dump, but I don't know how
> >> to specify that it be added to the dump without knowing how the memory
> >> is characterized.
> >
> > Whether the actual physical page that is referenced by your target page
> > structure is in the dumpfile should not affect the is_page_ptr() function.
> > That should work regardless.
>
> I think it is a good guess that the data I really want are not in the dump:
>
> # strings cdump-0c0s6n3 |grep -E 'Process (entered|leaving)'
> # strings cdump-0c2s6n3 |grep -E 'Process (entered|leaving)'
> # strings ../mrp752/sp1-fulldbg/dk.data | \
> # strings ../mrp752/sp1-fulldbg/dump.c0-0c1s0n0 | \
>     grep -E 'Process (entered|leaving)'|sort |uniq -c
>  311804 Process entered
>       1 Process enteredgot mutex:
>       2 Process enteredpage@
>  129991 Process leaving
> [...]
>
> The "cdump-0c0s6n3" and "cdump-0c2s6n3" files are from the release at issue,
> and the ../mrp752/sp1-fulldbg/dump.c0-0c1s0n0 dump is from the SLES-11 SP1
> release.  As you can see, there should be many thousands of matching strings
> in the dump files.  Since there are not, ...
>
> So: what physical pages are missing, and why are they missing?
> With those two questions resolved, we can fix the dump specification
> to include the missing pages.

I don't know how SUSE sets up their dumping operation.  I presume that they
use makedumpfile to post-process/filter /proc/vmcore into the dumpfile, and
therefore you would need to find out how it got invoked.

On RHEL systems, there is an /etc/kdump.conf file which specifies a
"core_collector", and if it specifies "makedumpfile", it also shows the exact
command line used to invoke it when running against /proc/vmcore in the
second kernel.  For example, by default we use:

  core_collector makedumpfile -c --message-level 1 -d 31

The makedumpfile(8) man page (or "makedumpfile --help") indicates which types
of memory will be filtered based upon the "-d <dump_level>" argument.  A
dump_level of 31 is the most aggressive:

       dump  | zero | cache| cache  | user | free
       level | page | page |private | data | page
      -------+------+------+--------+------+------
          0  |      |      |        |      |
          1  |  X   |      |        |      |
          2  |      |  X   |        |      |
          3  |  X   |  X   |        |      |
          4  |      |  X   |   X    |      |
          5  |  X   |  X   |   X    |      |
          6  |      |  X   |   X    |      |
          7  |  X   |  X   |   X    |      |
          8  |      |      |        |  X   |
          9  |  X   |      |        |  X   |
         10  |      |  X   |        |  X   |
         11  |  X   |  X   |        |  X   |
         12  |      |  X   |   X    |  X   |
         13  |  X   |  X   |   X    |  X   |
         14  |      |  X   |   X    |  X   |
         15  |  X   |  X   |   X    |  X   |
         16  |      |      |        |      |  X
         17  |  X   |      |        |      |  X
         18  |      |  X   |        |      |  X
         19  |  X   |  X   |        |      |  X
         20  |      |  X   |   X    |      |  X
         21  |  X   |  X   |   X    |      |  X
         22  |      |  X   |   X    |      |  X
         23  |  X   |  X   |   X    |      |  X
         24  |      |      |        |  X   |  X
         25  |  X   |      |        |  X   |  X
         26  |      |  X   |        |  X   |  X
         27  |  X   |  X   |        |  X   |  X
         28  |      |  X   |   X    |  X   |  X
         29  |  X   |  X   |   X    |  X   |  X
         30  |      |  X   |   X    |  X   |  X
         31  |  X   |  X   |   X    |  X   |  X

You might want to just filter zero-filled pages and free pages, which would
be a dump-level of 17.

Dave
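P.S.  In case it is easier to read than the table: the dump level is just a
5-bit mask, one bit per page type (bit 0 = zero pages, bit 1 = cache, bit 2 =
cache private, bit 3 = user data, bit 4 = free pages), so 31 excludes
everything and 17 excludes only zero-filled and free pages.  A small sketch
of the decoding; the macro names below are illustrative, not makedumpfile's
own:

#include <stdio.h>

/*
 * Illustrative names only (not makedumpfile's own macros): one bit per
 * filtered page type, matching the columns of the table above.
 */
#define EXCLUDE_ZERO      (1 << 0)   /* zero pages     */
#define EXCLUDE_CACHE     (1 << 1)   /* cache pages    */
#define EXCLUDE_CACHE_PRI (1 << 2)   /* cache private  */
#define EXCLUDE_USER      (1 << 3)   /* user data      */
#define EXCLUDE_FREE      (1 << 4)   /* free pages     */

static void show(int dump_level)
{
        printf("-d %2d excludes:%s%s%s%s%s\n", dump_level,
               dump_level & EXCLUDE_ZERO      ? " zero"          : "",
               dump_level & EXCLUDE_CACHE     ? " cache"         : "",
               dump_level & EXCLUDE_CACHE_PRI ? " cache-private" : "",
               dump_level & EXCLUDE_USER      ? " user"          : "",
               dump_level & EXCLUDE_FREE      ? " free"          : "");
}

int main(void)
{
        show(31);        /* zero cache cache-private user free */
        show(17);        /* zero free */
        return 0;
}

Compiled and run, that prints "-d 31 excludes: zero cache cache-private user
free" and "-d 17 excludes: zero free".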