Hi,

On Mon, Mar 18, 2013 at 8:29 AM, Dave Anderson <anderson@xxxxxxxxxx> wrote:

> By classification, do you mean which bit in the filtering option
> of makedumpfile?

Exactly.

>> Per your request:
>>
>> > crash> struct page 0xffffea001cdad420
>> > struct page {
>> >   flags = 0x200000000000000,
>> [...]
>
> OK, looks like a page struct (most likely)...

I was already pretty sure.  Confirmed.

>> > crash> kmem -p | tail
>>
>> OK, here's mine, along with the closest page numbers:
>>
>> >       PAGE          PHYSICAL    MAPPING  INDEX  CNT  FLAGS
>> > [...]
>> > ffffea64e939b6f0  1cc4b7fff000     0       0     0     0
>> <<fin>>
>
> Wow, that system has physical memory installed at an unusually high
> physical address location, i.e., where 1cc4b7fff000 is up around
> 28 terabytes?

That seems large to me too, by about a factor of 10.  It _is_ a
largish system.

> I'd be interested in seeing a dump of "kmem -n".  In your case the
> output is probably huge, but the top part would reflect the physical
> memory layout.

NODE    SIZE      PGLIST_DATA       BOOTMEM_DATA        NODE_ZONES
  0    8912880  ffff88087fffb000       ----          ffff88087fffb000
                                                     ffff88087fffb980
                                                     ffff88087fffc300
                                                     ffff88087fffcc80
    MEM_MAP         START_PADDR    START_MAPNR
ffffea0000000380       10000            16

ZONE  NAME      SIZE        MEM_MAP       START_PADDR  START_MAPNR
  0   DMA        4080  ffffea0000000380        10000           16
  1   DMA32   1044480  ffffea0000038000      1000000         4096
  2   Normal  7864320  ffffea0003800000    100000000      1048576
  3   Movable       0                 0            0            0
-------------------------------------------------------------------
NODE    SIZE      PGLIST_DATA       BOOTMEM_DATA        NODE_ZONES
  1    8388608  ffff88107fffa040       ----          ffff88107fffa040
                                                     ffff88107fffa9c0
                                                     ffff88107fffb340
                                                     ffff88107fffbcc0
    MEM_MAP         START_PADDR    START_MAPNR
ffffffffffffffff     880000000       8912896

ZONE  NAME      SIZE        MEM_MAP       START_PADDR  START_MAPNR
  0   DMA           0                 0            0            0
  1   DMA32         0                 0            0            0
  2   Normal  8388608                 0    880000000      8912896
  3   Movable       0                 0            0            0

    NR       SECTION         CODED_MEM_MAP        MEM_MAP           PFN
     0  ffff88087fffa000  ffffea0000000000  ffffea0000000000           0
     1  ffff88087fffa020  ffffea0000000000  ffffea00001c0000       32768
     2  ffff88087fffa040  ffffea0000000000  ffffea0000380000       65536
[...]
   130  ffff88107fff9040  ffffea0000000000  ffffea000e380000     4259840
   131  ffff88107fff9060  ffffea0000000000  ffffea000e540000     4292608
132096  ffff880838574558  ffff881038105798  ffff8848a8105798  4328521728
132098  ffff880838574598  ffff880837ed2c00  ffff8840a8252c00  4328587264
[...]
237504  ffff8810369d2f40  ffff8810369d2f40  ffff8875af9d2f40  7782531072
237505  ffff8810369d2f60           1a48b64        657ac08b64  7782563840
237506  ffff8810369d2f80          3686dc30        65afbedc30  7782596608
237507  ffff8810369d2fa0  ffff881033219740  ffff8875ac759740  7782629376
kmem: page excluded: kernel virtual address: ffff8810369d3000  type: "memory section"

> So your target page structure should "fit" into one of the
> sections above, where the starting MEM_MAP address of each
> section should have a contiguous array of page structs that
> reference the array of physical pages starting at the "PFN"
> value.  Those MEM_MAP addresses are typically increasing in
> value with each section, but I believe that I have seen cases
> where they are not.  And they shouldn't have to be; each section
> has a base vmemmap address for some number of PFN/physical-pages.

OK.  That is a bit confusing for me.

> Anyway, it does look like a page structure, and the page structure
> pointer itself is translatable.  The problem at hand is that the
> physical address that the page structure refers to is not being
> determined because the page structure address itself is not being
> recognized by is_page_ptr() as being part of the sparsemem
> infrastructure.  The "if IS_SPARSEMEM()" section at the top of
> is_page_ptr() is returning FALSE.
>
> That being said, from your target page structure address and the
> "kmem -n" output, you could presumably calculate the associated
> physical address.

If the kmem -n output didn't seem to skip over the address of
interest....

>> The memory in question is probably not in the dump, but I don't know
>> how to specify that it be added to the dump without knowing how the
>> memory is characterized.
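For my own notes, the calculation Dave describes (offset of the page
struct from the section's MEM_MAP base, scaled by the page struct size,
added to the section's starting PFN) can be sketched as below.  The
4 KiB page size and the 64-byte sizeof(struct page) are my assumptions
for this x86_64 kernel, not verified values, so treat the constants as
placeholders:

```python
# Sketch: translate a page struct address to a physical address using
# one row of the "kmem -n" section table (MEM_MAP base + starting PFN).
# ASSUMPTIONS: 4 KiB pages and sizeof(struct page) == 64; verify with
# "struct page" in crash before trusting the numbers.

PAGE_SIZE = 4096          # assumed page size
STRUCT_PAGE_SIZE = 64     # assumed sizeof(struct page)

def page_to_phys(page_addr, section_mem_map, section_pfn):
    """Return the physical address referenced by the page struct at
    page_addr, given its section's base MEM_MAP and starting PFN."""
    index = (page_addr - section_mem_map) // STRUCT_PAGE_SIZE
    pfn = section_pfn + index
    return pfn * PAGE_SIZE

# Example with section 1 from the table above
# (NR 1: MEM_MAP ffffea00001c0000, PFN 32768):
phys = page_to_phys(0xffffea00001c0000, 0xffffea00001c0000, 32768)
print(hex(phys))  # first page of the section, PFN 32768 -> 0x8000000
```

The same arithmetic, run against whichever section's MEM_MAP range
contains the target page struct address, is presumably what
is_page_ptr() would do if it recognized the address.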
>
> Whether the actual physical page that is referenced by your target
> page structure is in the dumpfile should not affect the is_page_ptr()
> function.  That should work regardless.

I think it is a good guess that the data I really want are not in the
dump:

# strings cdump-0c0s6n3 | grep -E 'Process (entered|leaving)'
# strings cdump-0c2s6n3 | grep -E 'Process (entered|leaving)'
# strings ../mrp752/sp1-fulldbg/dk.data | \
# strings ../mrp752/sp1-fulldbg/dump.c0-0c1s0n0 | \
> grep -E 'Process (entered|leaving)' | sort | uniq -c
 311804 Process entered
      1 Process enteredgot mutex:
      2 Process enteredpage@
 129991 Process leaving
[...]

The "cdump-0c0s6n3" and "cdump-0c2s6n3" files are from the release at
issue, and the ../mrp752/sp1-fulldbg/dump.c0-0c1s0n0 dump is from the
SLES-11 SP1 release.  As you can see, there should be many thousands of
matching strings in the dump files.  Since there are none, ...

So: what physical pages are missing, and why are they missing?  With
those two questions resolved, we can fix the dump specification to
include the missing pages.

Thank you again.

- Bruce

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility