Re: questions about crash utility

Dave Anderson <anderson@xxxxxxxxxx> · Fri, 18 Jan 2013 10:23:00 -0500 (EST)

----- Original Message -----
> 
> 
> 
> Hi Dave:
> 
> thank you very much for your detailed answer, this is really helpful.
> please see my inline comments. thanks.
> 
> 
> > Date: Thu, 17 Jan 2013 14:17:36 -0500
> > From: anderson@xxxxxxxxxx
> > To: crash-utility@xxxxxxxxxx
> > Subject: Re:  questions about crash utility
> 
> > The fact that crash gets as far as it does at least means that the
> > ELF header you've created was deemed acceptable as an ARM vmcore.
> > However, the error messages re: "cpu_present_mask indicates..." and
> > "cannot determine base kernel version" indicate that the data
> > that was read from the vmcore was clearly not the correct data.
> > 
> > The "cpu_present_mask" value that it read contained too
> > many bits -- presuming that the 32-bit ARM processor is
> > still limited to only 4 cpus. (looks like upstream that
> > CONFIG_NR_CPUS is still 2 in the arch/arm/configs files.)
> > 
> > But more indicative of the wrong data being read is the second
> > "cannot determine base kernel version" message, which was generated
> > after it read the kernel's "init_uts_ns" uts_namespace structure.
> > After reading it, it sees that the "release" string contains
> > non-ASCII data, whereas it should contain the kernel version:
> > 
> > crash> p init_uts_ns
> > init_uts_ns = $3 = {
> > kref = {
> > refcount = {
> > counter = 2
> > }
> > },
> > name = {
> > sysname =
> > "Linux\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
> > nodename =
> > "phenom-01.lab.bos.redhat.com\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
> > release =
> > "2.6.32-313.el6.x86_64\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
> > version = "#1 SMP Thu Sep 27 16:25:19 EDT 2012\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
> > machine =
> > "x86_64\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
> > domainname =
> > "(none)\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"
> > }
> > }
> > crash>
> > 
> > So it appears that you're reading data from the wrong
> > locations in the dumpfile. You should be able to verify
> > that by bringing up the crash session with the --minimal
> > flag like this:
> > 
> > $ crash --minimal vmlinux vmcore
> > 
> > That will bypass most of the initialization, including all
> > readmem() calls of the vmcore. Then do this:
> > 
> > crash> rd linux_banner 20
> > ffffffff818000a0: 65762078756e694c 2e33206e6f697372 Linux version 3.
> > ffffffff818000b0: 63662e312d312e35 365f3638782e3731 5.1-1.fc17.x86_6
> > ffffffff818000c0: 626b636f6d282034 69756240646c6975 4(mockbuild@bui
> > ffffffff818000d0: 2e33322d6d76646c 6465662e32786870 ldvm-23.phx2.fed
> > ffffffff818000e0: 656a6f727061726f 202967726f2e7463 oraproject.org)
> > ffffffff818000f0: 7265762063636728 372e34206e6f6973 (gcc version 4.7
> > ffffffff81800100: 303231303220302e 6465522820373035 .0 20120507 (Red
> > ffffffff81800110: 372e342074614820 47282029352d302e Hat 4.7.0-5) (G
> > ffffffff81800120: 3123202920294343 75685420504d5320 CC) ) #1 SMP Thu
> > ffffffff81800130: 3120392067754120 2033343a30353a37 Aug 9 17:50:43
> > crash> rd -a linux_banner
> > ffffffff818000a0: Linux version 3.5.1-1.fc17.x86_64 (mockbuild@buildvm-23.phx2
> > ffffffff818000dc: .fedoraproject.org) (gcc version 4.7.0 20120507 (Red Hat 4.7
> > ffffffff81800118: .0-5) (GCC) ) #1 SMP Thu Aug 9 17:50:43 UTC 2012
> > crash>
> > 
> > I'm guessing that you will not see a string starting with "Linux version"
> > with your dumpfile as shown above.
> > 
> > If that's the case, then it's clear that the readmem() function is ultimately
> > reading from the wrong vmcore file offset.
> > 
> > Here's what you can try doing. Taking the linux_banner example above,
> > you can check where in the dumpfile it's reading from by setting the debug
> > flag, before doing a simple read -- like this example on an ARM dumpfile:
> > 
> > crash> set debug 8
> > debug: 8
> > crash> rd linux_banner
> > <addr: c033ea10 count: 1 flag: 488 (KVADDR)>
> > <readmem: c033ea10, KVADDR, "32-bit KVADDR", 4, (FOE), ff94f048>
> > <read_kdump: addr: c033ea10 paddr: 33ea10 cnt: 4>
> > read_netdump: addr: c033ea10 paddr: 33ea10 cnt: 4 offset: 33f088
> > c033ea10: 756e694c Linu
> > crash>
> > 
> > The linux_banner is at virtual address c033ea10 (addr). First it gets translated
> > into physical address 33ea10 (paddr). Then that paddr is translated into the
> > vmcore file offset of 33f088. It lseeks to vmcore file offset 33f088 and
> > reads 4 bytes, which contain "756e694c", or the first 4 bytes of the
> > "Linux version ..." string.
> > 
> > Note that if I subtract the physical address from vmcore file offset
> > I get this:
> > 
> > crash> eval 33f088 - 33ea10
> > hexadecimal: 678
> > decimal: 1656
> > octal: 3170
> > binary: 00000000000000000000011001111000
> > crash>
> > 
> > which would put physical address 0 at a vmcore file offset of 0x678,
> > thereby implying that the ELF header comprises the first 0x678 bytes.
> > And looking at the vmcore, that can be verified:
> > 
> 
> yes you are right, here i get the result as below:
> crash> set debug 8
> debug: 8
> crash> rd linux_banner
> <addr: c065a071 count: 1 flag: 488 (KVADDR)>
> <readmem: c065a071, KVADDR, "32-bit KVADDR", 4, (FOE), ffdf297c>
> <read_kdump: addr: c065a071 paddr: 85a071 cnt: 4>
> read_netdump: addr: c065a071 paddr: 85a071 cnt: 4 offset: 65a0e5
> c065a071: 03e59130 0...
> 
> the virtual address is 0xc065a071 , and the physical address is 
> 0x85a071 , and the offset is 0x65a0e5.
> my elf header is 116 bytes long, and 0x65a0e5 - 116 = 0x65a071, which
> differs from the physical address 0x85a071 by 0x00200000.
> 
> 
> > $ readelf -a vmcore
> > ELF Header:
> > Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
> > Class: ELF32
> > Data: 2's complement, little endian
> > Version: 1 (current)
> > OS/ABI: UNIX - System V
> > ABI Version: 0
> > Type: CORE (Core file)
> > Machine: ARM
> > Version: 0x1
> > Entry point address: 0x0
> > Start of program headers: 52 (bytes into file)
> > Start of section headers: 0 (bytes into file)
> > Flags: 0x0
> > Size of this header: 52 (bytes)
> > Size of program headers: 32 (bytes)
> > Number of program headers: 3
> > Size of section headers: 0 (bytes)
> > Number of section headers: 0
> > Section header string table index: 0
> > 
> > There are no sections in this file.
> > 
> > There are no sections to group in this file.
> > 
> > Program Headers:
> > Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
> > NOTE 0x000094 0x00000000 0x004e345c 0x005e4 0x005e4 0
> > LOAD 0x000678 0xc0000000 0x00000000 0x5600000 0x5600000 RWE 0
> > LOAD 0x5600678 0xc5700000 0x05700000 0x100000 0x100000 RWE 0
> > ...
> > 
> > Note that the "Offset" value of the first PT_LOAD segment has a file offset
> > value of 0x678.
> > 
> 
> here i got the result as below:
> Program Headers:
> Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
> NOTE 0x000000 0x00000000 0x00000000 0x00000 0x00000 0
> LOAD 0x000074 0xc0000000 0x00200000 0x2fe00000 0x2fe00000 RWE 0
> 
> so the problem is i don't understand the elf header meaning
> accurately. if i modify code as below, everything is ok for me:
> 
> offset += sizeof(struct elf_phdr);
> phdr->p_offset = offset+0x00200000;
> phdr->p_vaddr = 0xc0000000;
> phdr->p_paddr = 0x00200000;
> phdr->p_filesz = phdr->p_memsz = MEMSIZE-0x00200000;
> 
> 
> although my modification can make crash utility work well, i want to
> know exactly whether i am doing the right thing.
> 1. our platform has the ddr address from physical address 0x0.
> 2. when compiling Linux kernel, our platform set in .config file:
> CONFIG_PHYS_OFFSET=0x00200000
> 3. when Kernel crash, all ddr content will be dumped, from address
> 0x0~768MB. but kernel data starts from 0x00200000 actually.
> 
> my questions are:
> 1. whether my setting of ELF header is correct this time? the offset,
> paddr, and p_memsz?

I'm not really sure.  Even though you've got it to work OK, I don't
understand your new phdr->p_offset and phdr->p_filesz/phdr->p_memsz 
settings.  The phdr->p_offset value typically points to the beginning
of the physical memory segment, which in your case, would be at physical
address 0x0 at file offset 0x74.  And the phdr->p_filesz/phdr->p_memsz
values are typically equal to the full size of the physical memory 
segment (MEMSIZE).

I only have one ELF ARM dumpfile sample, but it does not have any
physical offset:

 crash> vtop c0000000
 VIRTUAL   PHYSICAL
 c0000000  0

 PAGE DIRECTORY: c0004000
   PGD: c0007000 => 1140e
   PMD: c0007000 => 1140e
  PAGE:        0  (1MB)

   PAGE    PHYSICAL   MAPPING    INDEX CNT FLAGS
 c042d000         0         0         0  0 80000
 crash>

Does "vtop c0000000" work as expected on your vmcore?

Also, can you read the last physical page of memory?  For example, on 
my ARM dump, I can check that by doing this:

 crash> kmem -p | tail -5
 c04dcf60   57fb000         0         0  1 400
 c04dcf80   57fc000         0         0  1 400
 c04dcfa0   57fd000         0         0  1 400
 c04dcfc0   57fe000         0         0  1 400
 c04dcfe0   57ff000         0         0  1 400
 crash> rd -p 57ff000
  57ff000:  ef9f0000                              ....
 crash>

Also, can you confirm that your kernel's symbol list starts
at c0000000, i.e., something like this:

 crash> sym -l
 c0004000 (A) swapper_pg_dir
 c0008000 (t) .init
 c0008000 (T) __init_begin
 c0008000 (T) _sinittext
 c0008000 (T) _stext
 c0008000 (T) stext
 c0008040 (t) __create_page_tables
 c00080e4 (t) __enable_mmu_loc
 c00080f0 (t) __error_a
 c00080f4 (t) __lookup_machine_type
 c0008128 (t) __lookup_machine_type_data
 ...

I just want to make sure that the kernel symbols actually start
at c0000000, and not c2000000.

> 2. i am wondering how does crash utility translate virtual address to
> physical address before and after it get the kernel page table?
> before get kernel page table, does it calculate as : (virtual_addr -
> p_vaddr + p_paddr) ? after get kernel page table, does it walk
> through the page table and find out the real physical address
> accordingly?

For unity-mapped kernel virtual addresses, it's not necessary
to walk the page tables.  It simply does this:

 #define VTOP(X) \
         ((unsigned long)(X)-(machdep->kvbase)+(machdep->machspec->phys_base))

You can check your machdep->kvbase and machdep->machspec->phys_base
values by entering "help -m", for example:

 crash> help -m | grep -e kvbase -e phys_base
              kvbase: c0000000
           phys_base: 0
 crash>

Certainly vmalloc (and user-space) virtual addresses require a page
table walk, but the arm_kvtop() function does this:

 static int
 arm_kvtop(struct task_context *tc, ulong kvaddr, physaddr_t *paddr, int verbose)
 {
         if (!IS_KVADDR(kvaddr))
                 return FALSE;

         if (!vt->vmalloc_start) {
                 *paddr = VTOP(kvaddr);
                 return TRUE;
         }

         if (!IS_VMALLOC_ADDR(kvaddr)) {
                 *paddr = VTOP(kvaddr);     /* unity-mapped kernel virtual addresses */
                 if (!verbose)
                         return TRUE;
         }

         return arm_vtop(kvaddr, (ulong *)vt->kernel_pgd[0], paddr, verbose);
 }

where vmalloc addresses fall through to arm_vtop(), which walks
the page tables.

However, you can translate unity-mapped addresses using the kernel page tables
with the "vtop" command, as shown in the "vtop c0000000" example above.

> 3. my real purpose is to get the ftrace content from dump file by
> crash utility , but seem the command trace is not for this case, do
> i need to compile the extension "trace" of crash utility? is there
> any guide to follow?

That's correct.  You can do this:

 $ wget http://people.redhat.com/anderson/crash-6.1.2.tar.gz
 ...
 $ tar xvzmf crash-6.1.2.tar.gz
 ...
 $ cd crash-6.1.2
 $ make
 ...
 $ make extensions
 ...
 $ ./crash vmlinux vmcore
 ...
 crash> extend trace.so
 ./extensions/trace.so: shared object loaded
 crash> help trace
 ...

Dave

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility