Hi Dave,

On 03/15/13 07:07, Dave Anderson wrote:
>> extension working again. It used to work, but does no more.
>> It first calls is_page_ptr(kvaddr, &kpaddr) to convert a virtual
>> address into a physical address, and then calls:
>>
>>> readmem(kpaddr, PHYSADDR, buf, used,
>>>         "trace page data", RETURN_ON_ERROR)
>>
>> to fetch the bytes. Updating the release to SLES-11 SP2 causes
>> this to now fail.
>
> So are you saying that it works with an earlier kernel version?

Yep. My first guess on this is that there is some different
classification of the memory and that the new classification
is not selected by the crash dump.

>> Help, please? Thank you!
>
> It is translating the vmemmap'ed kernel address to a physical address
> by walking the page tables, and finding it in a 2MB big-page.
> If you skip the is_page_ptr() qualifier, does this work, and
> if so, does it look like a legitimate page structure?:

It is both a qualifier and a translator-to-phys page.
I'll have to do some research on how to invoke readmem with the
virtual address instead of physical address. Eventually, they
must all fold back into crash's memory.c readmem() function.
Per your request:

> crash> struct page 0xffffea001cdad420
> struct struct page {
>   flags = 0x200000000000000,
>   _count = { counter = 0x1 },
>   {
>     _mapcount = { counter = 0xffffffff },
>     { inuse = 0xffff, objects = 0xffff }
>   },
>   {
>     { private = 0x0, mapping = 0x0 },
>     ptl = {
>       { rlock = { raw_lock = { slock = 0x0 } } }
>     },
>     slab = 0x0,
>     first_page = 0x0
>   },
>   {
>     index = 0xffff88067b39a400,
>     freelist = 0xffff88067b39a400,
>     pfmemalloc = 0x0
>   },
>   lru = {
>     next = 0xdead000000100100,
>     prev = 0xdead000000200200
>   }
> }

> But the sparsemem stuff doesn't seem to be accepting it as a vmemmap
> page struct address. Does "kmem -p" include physical address 0x87afad420?
> For example, on my system, the last physical page mapped in the
> vmemmap is 21ffff000:
>
> crash> kmem -p | tail

OK, here's mine, along with the closest page numbers:

> PAGE              PHYSICAL      MAPPING  INDEX  CNT  FLAGS
> [...]
> ffffea000e6ffee8  41fffb000     0        c5600  0    200000000000000
> ffffea000e6fff20  41fffc000     0        c5600  0    200000000000000
> ffffea000e6fff58  41fffd000     0        c5600  0    200000000000000
> ffffea000e6fff90  41fffe000     0        c5600  0    200000000000000
> ffffea000e6fffc8  41ffff000     0        c5600  0    200000000000000

<<no 0xffffea001cdad420 entry, the next line is:>>

> ffffea56189f2488  189120000000  0        0      0    0
> ffffea56189f24c0  189120001000  0        0      0    0
> ffffea56189f24f8  189120002000  0        0      0    0
> ffffea56189f2530  189120003000  0        0      0    0
> [...]
> ffffea64e939b648  1cc4b7ffc000  0        0      0    0
> ffffea64e939b680  1cc4b7ffd000  0        0      0    0
> ffffea64e939b6b8  1cc4b7ffe000  0        0      0    0
> ffffea64e939b6f0  1cc4b7fff000  0        0      0    0

<<fin>>

> Anyway, the first thing that needs to be done is to verify that
> the SECTION_SIZE_BITS and MAX_PHYSMEM_BITS are being set up
> correctly. The upstream kernel currently has:
>
> # define SECTION_SIZE_BITS 27 /* matt - 128 is convenient right now */
> # define MAX_PHYSADDR_BITS 44
> # define MAX_PHYSMEM_BITS 46

That is what linux-3.0.13-0.27 has for x86-64, too.

> crash> help -m | grep -e section -e physmem
> section_size_bits: 27
> max_physmem_bits: 46
> sections_per_root: 128
> crash>

Matches my output. Is there a way to coerce readelf to tell me
anything about the crash dump?

If you are curious to look at the actual dump, I can tell you how
to get it via ftp (offline). The extension is on github:

    git clone git://github.com/brkorb/lustre-crash-tools.git

and cr-ext/lustre-ext.c is the one.

The memory in question is probably not in the dump, but I don't know
how to specify that it be added to the dump without knowing how the
memory is characterized.

Thank you for your help!

Regards, Bruce

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
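
For reference, the arithmetic behind the vmemmap and section questions above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not how crash itself does the lookup: the vmemmap base (0xffffea0000000000) and the 4 KB page size are the usual x86-64 values and are not stated in the thread, the 56-byte struct page size is inferred from the 0x38 stride between consecutive PAGE values in the "kmem -p" listing, and the helper names are hypothetical. Only SECTION_SIZE_BITS (27) comes from the thread itself.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTIONS, not taken from the thread: */
#define VMEMMAP_BASE      0xffffea0000000000ULL /* usual x86-64 vmemmap start */
#define STRUCT_PAGE_SIZE  56ULL /* 0x38 stride between PAGE values in kmem -p */
#define PAGE_SHIFT        12    /* 4 KB pages */

/* From the thread ("help -m" output and the kernel headers): */
#define SECTION_SIZE_BITS 27

/* Hypothetical helper: index of a struct page within the vmemmap array. */
static inline uint64_t page_to_pfn(uint64_t page)
{
    return (page - VMEMMAP_BASE) / STRUCT_PAGE_SIZE;
}

/* Hypothetical helper: physical address of the page frame itself. */
static inline uint64_t pfn_to_phys(uint64_t pfn)
{
    return pfn << PAGE_SHIFT;
}

/* Hypothetical helper: sparsemem section number covering a physical address. */
static inline uint64_t phys_to_section(uint64_t phys)
{
    return phys >> SECTION_SIZE_BITS;
}

/* Check the assumed base and stride against one row of the listing:
 * PAGE ffffea000e6fffc8 is shown with PHYSICAL 41ffff000. */
static inline void check_against_listing(void)
{
    assert(pfn_to_phys(page_to_pfn(0xffffea000e6fffc8ULL)) == 0x41ffff000ULL);
}
```

If those assumptions hold, the page struct at 0xffffea001cdad420 would describe pfn 0x83e85c, i.e. physical address 0x83e85c000 in section 263. That is above the 41ffff000 row and below the 189120000000 row, which would be consistent with the address having no entry in the listing. Note that 0x87afad420 is a different number: it is the physical location of the page struct itself, as found by the page-table walk.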