On Wed, 2010-04-21 at 09:58 -0400, Dave Anderson wrote:
> ----- "Pavan Naregundi" <pavan@xxxxxxxxxxxxxxxxxx> wrote:
>
> > On Tue, 2010-04-20 at 09:14 -0400, Dave Anderson wrote:
> > > ----- "Pavan Naregundi" <pavan@xxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > The cause for seek errors depends upon the type
> > > of dumpfile.
> > >
> > > You didn't mention which type of dumpfile the vmcore
> > > is, so I'll presume that it's either an ELF-format
> > > kdump or a compressed kdump created by makedumpfile.
> > >
> > > So presuming that it's a compressed kdump, the seek error
> > > most likely comes from here in read_diskdump() in diskdump.c:
> > >
> > >         if ((pfn >= dd->header->max_mapnr) || !page_is_ram(pfn))
> > >                 return SEEK_ERROR;
> > >
> > > where the requested physical address pfn values are larger
> > > than the max_mapnr value advertised in the header.
> > >
> > > When you do any "crash -d# ...", the dumpfile header will
> > > be dumped first. What does that show?
> > >
> > > Dave
> >
> > Dave,
> >
> > Dumpfile is compressed kdump created by makedumpfile.
> >
> > header shows the following values:
> > max_mapnr: 32768
> > block_shift: 16
> >
> > Yes. Adding some debug printf's shows me that (pfn >=
> > dd->header->max_mapnr) fails.
> >
> > For example: in the first seek error,
> > crash: seek error: kernel virtual address: c0000000af715480 type:
> > "kmem_cache buffer"
> >
> > paddr: af715480 => pfn=44913
> >
> > crash -d8 log: http://pastebin.com/qrCvyPfR
> >
> > Thanks..Pavan
>
> OK, so the compressed dumpfile has exactly 32768 pages of physical
> memory, or exactly 2GB. That being the case, the crash utility
> will fail all readmem attempts above that value, and obviously
> there is critical data above the artificial 2GB threshold.
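The arithmetic behind that conclusion can be checked with a short C sketch. This is not code from diskdump.c: the helper names (paddr_to_pfn, pfn_in_dump, dump_coverage_bytes) are hypothetical, and the page_is_ram() bitmap test from the real check is left out. It only assumes that the pfn is the physical address shifted right by the header's block_shift.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only; they are not part of
 * the crash utility.
 */

/* Convert a physical address to a page frame number. */
static uint64_t paddr_to_pfn(uint64_t paddr, unsigned block_shift)
{
    assert(block_shift < 64);   /* a shift of 64 or more is undefined */
    return paddr >> block_shift;
}

/*
 * Mirror only the first half of the read_diskdump() test quoted
 * above: a pfn at or beyond max_mapnr is outside the dumpfile.
 */
static int pfn_in_dump(uint64_t pfn, uint64_t max_mapnr)
{
    return pfn < max_mapnr;
}

/* Bytes of physical memory that max_mapnr pages can describe. */
static uint64_t dump_coverage_bytes(uint64_t max_mapnr, unsigned block_shift)
{
    assert(block_shift < 64);
    return max_mapnr << block_shift;
}
```

With the header values reported above, paddr_to_pfn(0xaf715480, 16) gives 44913, which is not below the max_mapnr of 32768, so the read is rejected. Likewise, 32768 pages of 64KB each cover exactly 2GB.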
>
> The question at hand is why kdump is creating a truncated dumpfile
> with a max_mapnr of 32768:
>
>  (1) makedumpfile determines the "max_mapnr" value based upon the
>      highest physical address found in any of the PT_LOAD segments
>      of the /proc/vmcore file on the secondary kernel.
>  (2) the /proc/vmcore PT_LOAD segments were pre-calculated during
>      the primary kernel's kdump initialization phase, based upon
>      the values found in the set of "/proc/device-tree/memory@xxx/reg"
>      files existing in the primary kernel, where the "xxx" is the
>      starting physical address of the memory region, and the "reg"
>      file in that directory contains the size of the memory region.
>
> For whatever reason, those files showed a maximum of 2GB of
> physical memory. (If you do not use makedumpfile, and then do
> a "readelf -a" of the resultant vmcore file, you will see
> the PT_LOAD segment values.)
>
> Does the SLES11 vmlinux-2.6.32.10-0.4.99.25.62005-ppc64 kernel
> contain this patch?:
>
>   http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=8be8cf5b47f72096e42bf88cc3afff7a942a346c
>
> I ask because we also have an outstanding bugzilla that exhibits similar
> behavior, where an abnormally small ppc64 vmcore file gets created
> because there was only a single /proc/device-tree/memory@0 directory
> file that showed just a small subset of the total physical memory.
> Typically there are many of those "memory@xxx" directories, but in
> the failing scenario, there was only one /proc/device-tree/memory@0
> directory.
>
> Anyway, there's (unproven) speculation that the kernel patch above
> is related to the problem.
>
> In any case, unfortunately, there's nothing that can be done from the
> crash utility's perspective.
>
> Dave

Thank you Dave.

Our SLES11 kernel does not have the patch you mentioned. However, the
system is not AMS-enabled, and CONFIG_CMM is not set in the config
file. This system also has only a /proc/device-tree/memory@0 directory.
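A minimal sketch of point (1) in the quoted explanation, showing how a single 2GB region would lead to the header value seen here. This is not makedumpfile's code: the struct and function names are hypothetical, the segment is reduced to the two fields that matter, and rounding the highest address up to a page boundary is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one PT_LOAD program header. */
struct load_segment {
    uint64_t paddr;   /* starting physical address (p_paddr) */
    uint64_t memsz;   /* size of the region in bytes (p_memsz) */
};

/* Highest physical address (exclusive) covered by any segment. */
static uint64_t max_paddr(const struct load_segment *segs, size_t n)
{
    uint64_t max = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t end = segs[i].paddr + segs[i].memsz;
        if (end > max)
            max = end;
    }
    return max;
}

/*
 * Number of page frames needed to cover [0, max_paddr), rounding up
 * to a whole page (assumed behavior, see the note above).
 */
static uint64_t max_mapnr_from_segments(const struct load_segment *segs,
                                        size_t n, unsigned page_shift)
{
    uint64_t page_size = 1ULL << page_shift;

    return (max_paddr(segs, n) + page_size - 1) >> page_shift;
}
```

Under those assumptions, one segment starting at 0 with a size of 0x80000000 and a page shift of 16 yields a max_mapnr of 32768, matching the header. Any memory not described by a PT_LOAD segment never enters the calculation.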
Regards..Pavan

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility