----- "Dave Anderson" <anderson@xxxxxxxxxx> wrote:

> Somewhere between the RHEL5 (2.6.18-based) and RHEL6 timeframe,
> the ppc64 architecture has started using a virtual memmap scheme
> for the arrays of page structures used to describe/handle
> each physical page of memory.

... [ snip ] ...

> So my speculation (guess?) is that the ppc64.c ppc64_vtop()
> function needs updating to properly translate these addresses.
>
> Since the ppc64 stuff in the crash utility was written by, and
> has been maintained by IBM (and since I am ppc64-challenged),
> can you guys take a look at what needs to be done?

[ sound of crickets... ]

Well that request apparently fell on deaf ears...

Here's my understanding of the situation.

In 2.6.26 the ppc64 architecture started using a new kernel virtual
memory region to map the kernel's page structure array(s), so that
now there are three kernel virtual memory regions:

  KERNEL   0xc000000000000000
  VMALLOC  0xd000000000000000
  VMEMMAP  0xf000000000000000

The KERNEL region is the unity-mapped region, where the underlying
physical address can be determined by manipulating the virtual address
itself.

The VMALLOC region requires a page-table walk-through to find the
underlying physical address in a PTE.

The new VMEMMAP region is mapped in ppc64 firmware, where a physical
address of a given size is mapped to a VMEMMAP virtual address. So for
example, the page structure for physical page 0 is at VMEMMAP address
0xf000000000000000, the page for physical page 1 is at
0xf000000000000068, and so on. Once mapped in the firmware TLB (?) the
virtual-to-physical translation is done automatically while running in
kernel mode.

The problem is that the physical-to-vmemmap address/size mapping
information is not stored in the kernel proper, so there is no way for
the crash utility to make the translation. That being the case, any
crash command that needs to read the contents of any page structure
will fail.
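
To make the address arithmetic above concrete, here is a minimal C
sketch. It is not crash utility code. The helper names (classify,
unity_vtop, pfn_to_vmemmap) are made up for illustration; picking the
region from the top four address bits is an assumption based on the
three base addresses listed above; treating the unity-mapped
translation as a subtraction of the KERNEL base is likewise an
assumption; and the 0x68 page structure size is inferred from the
"physical page 1" address in the example, so it may differ on other
kernel configurations.

```c
#include <assert.h>
#include <stdint.h>

/* Region bases as listed in this message. */
#define KERNEL_BASE   0xc000000000000000ULL
#define VMALLOC_BASE  0xd000000000000000ULL
#define VMEMMAP_BASE  0xf000000000000000ULL

/* Assumption: sizeof(struct page) is 0x68 here, inferred from the
 * VMEMMAP address given for the page structure of physical page 1. */
#define PAGE_STRUCT_SIZE 0x68ULL

enum region { REGION_KERNEL, REGION_VMALLOC, REGION_VMEMMAP, REGION_OTHER };

/* Hypothetical helper: choose the region from the top four address bits. */
static enum region classify(uint64_t vaddr)
{
    switch (vaddr >> 60) {
    case 0xc: return REGION_KERNEL;
    case 0xd: return REGION_VMALLOC;
    case 0xf: return REGION_VMEMMAP;
    default:  return REGION_OTHER;
    }
}

/* Unity-mapped region: assume the physical address is the offset
 * from the KERNEL base. Only valid for KERNEL region addresses. */
static uint64_t unity_vtop(uint64_t vaddr)
{
    assert(classify(vaddr) == REGION_KERNEL);
    return vaddr - KERNEL_BASE;
}

/* VMEMMAP virtual address of the page structure for a given page
 * frame number. Computing this address is easy; translating it to
 * a physical address is the part that has no data to work from. */
static uint64_t pfn_to_vmemmap(uint64_t pfn)
{
    return VMEMMAP_BASE + pfn * PAGE_STRUCT_SIZE;
}
```

With these definitions, pfn_to_vmemmap(1) yields 0xf000000000000068,
matching the example above. Note that nothing here can turn a VMEMMAP
address into a physical one, which is exactly the gap described.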

The kernel mapping is performed here in 2.6.26 through 2.6.31:

  int __meminit vmemmap_populate(struct page *start_page,
                                 unsigned long nr_pages, int node)
  {
          unsigned long start = (unsigned long)start_page;
          unsigned long end = (unsigned long)(start_page + nr_pages);
          unsigned long page_size = 1 << mmu_psize_defs[mmu_vmemmap_psize].shift;

          /* Align to the page size of the linear mapping. */
          start = _ALIGN_DOWN(start, page_size);

          for (; start < end; start += page_size) {
                  int mapped;
                  void *p;

                  if (vmemmap_populated(start, page_size))
                          continue;

                  p = vmemmap_alloc_block(page_size, node);
                  if (!p)
                          return -ENOMEM;

                  pr_debug("vmemmap %08lx allocated at %p, physical %08lx.\n",
                          start, p, __pa(p));

                  mapped = htab_bolt_mapping(start, start + page_size, __pa(p),
                                             pgprot_val(PAGE_KERNEL),
                                             mmu_vmemmap_psize, mmu_kernel_ssize);
                  BUG_ON(mapped < 0);
          }

          return 0;
  }

So if the pr_debug() statement is turned on, it shows on my test system:

  vmemmap f000000000000000 allocated at c000000003000000, physical 03000000

This would make for an extremely simple virtual-to-physical translation
for the crash utility, but note that neither the unity-mapped virtual
address of 0xc000000003000000 nor its associated physical address of
0x3000000 is stored anywhere, since "p" is a stack variable. The
htab_bolt_mapping() function does not store the mapping information in
the kernel either; it just uses temporary stack variables before
calling the ppc_md.hpte_insert() function, which eventually leads to a
machine-dependent (directly to firmware) function.

So unless I'm missing something, nowhere along the vmemmap call-chain
are the VTOP address/size particulars stored anywhere -- say for
example, in a /proc/iomem-like "resource" data structure.

(FWIW, I note that in 2.6.32, CONFIG_PPC_BOOK3E arches still use the
normal page tables to map the memmap array(s). I don't know whether
BOOK3E arch is the most common or not...)
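
To show how simple the translation would be if the start/__pa(p)/
page_size triplets from vmemmap_populate() were kept somewhere, here
is a sketch of a lookup over such records. Everything in it is
hypothetical: struct vmemmap_map and vmemmap_vtop() are names invented
for this illustration, and no such data exists in the 2.6.26 through
2.6.31 kernels discussed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one vmemmap_populate() mapping. The kernels
 * discussed in this message do not store anything like this. */
struct vmemmap_map {
    uint64_t virt;  /* start of the VMEMMAP virtual range ("start") */
    uint64_t phys;  /* physical address backing it ("__pa(p)") */
    uint64_t size;  /* size of the mapping ("page_size") */
};

/* Translate a VMEMMAP virtual address using the recorded mappings.
 * Returns 1 and stores the result in *paddr on a hit, 0 on a miss. */
static int vmemmap_vtop(const struct vmemmap_map *maps, size_t nmaps,
                        uint64_t vaddr, uint64_t *paddr)
{
    assert(maps != NULL || nmaps == 0);
    assert(paddr != NULL);

    for (size_t i = 0; i < nmaps; i++) {
        /* Hit if vaddr lies in [virt, virt + size). */
        if (vaddr >= maps[i].virt && vaddr - maps[i].virt < maps[i].size) {
            *paddr = maps[i].phys + (vaddr - maps[i].virt);
            return 1;
        }
    }
    return 0;
}
```

Given a single record built from the pr_debug() line above (virtual
0xf000000000000000, physical 0x3000000, and some mapping size, which
that output does not show), a lookup of 0xf000000000000068 would
return physical 0x3000068. A linear scan is used only to keep the
sketch short.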

In any case, not being able to read the page structure contents has a
significant effect on the crash utility. This is about the only thing
that can be done for these kernels, where a warning gets printed during
initialization, and any command that attempts to read a page structure
will subsequently fail:

  # crash vmlinux vmcore

  crash 4.1.2p1
  Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009  Red Hat, Inc.
  Copyright (C) 2004, 2005, 2006  IBM Corporation
  Copyright (C) 1999-2006  Hewlett-Packard Co
  Copyright (C) 2005, 2006  Fujitsu Limited
  Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
  Copyright (C) 2005  NEC Corporation
  Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
  Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
  This program is free software, covered by the GNU General Public License,
  and you are welcome to change it and/or distribute copies of it under
  certain conditions.  Enter "help copying" to see the conditions.
  This program has absolutely no warranty.  Enter "help warranty" for details.

  GNU gdb 6.1
  Copyright 2004 Free Software Foundation, Inc.
  GDB is free software, covered by the GNU General Public License, and you are
  welcome to change it and/or distribute copies of it under certain conditions.
  Type "show copying" to see the conditions.
  There is absolutely no warranty for GDB.  Type "show warranty" for details.
  This GDB was configured as "powerpc64-unknown-linux-gnu"...
  WARNING: cannot translate vmemmap kernel virtual addresses:
           commands requiring page structure contents will fail

        KERNEL: vmlinux
      DUMPFILE: vmcore
          CPUS: 2
          DATE: Thu Dec 10 05:40:35 2009
        UPTIME: 21:44:59
  LOAD AVERAGE: 0.11, 0.03, 0.01
         TASKS: 196
      NODENAME: ibm-js20-04.lab.bos.redhat.com
       RELEASE: 2.6.31-38.el6.ppc64
       VERSION: #1 SMP Sun Nov 22 08:15:30 EST 2009
       MACHINE: ppc64  (unknown Mhz)
        MEMORY: 2 GB
         PANIC: "Oops: Kernel access of bad area, sig: 11 [#1]" (check log for details)
           PID: 10656
       COMMAND: "runtest.sh"
          TASK: c000000072156420  [THREAD_INFO: c000000072058000]
           CPU: 0
         STATE: TASK_RUNNING (PANIC)

  crash> kmem -i
  kmem: cannot translate vmemmap address: f000000000000000
  crash> kmem -p
  PAGE  PHYSICAL  MAPPING  INDEX  CNT  FLAGS
  kmem: cannot translate vmemmap address: f000000000000000
  crash> kmem -s
  CACHE  NAME  OBJSIZE  ALLOCATED  TOTAL  SLABS  SSIZE
  kmem: cannot translate vmemmap address: f00000000030db44
  crash>

Can any of the IBM engineers on this list (or any ppc64 user) confirm
my findings? Maybe I'm missing something, but I don't see it. And if
you agree, perhaps you can work on an upstream solution to store the
vmemmap-to-physical data information?

Dave

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility