Re: [PATCH 0/5] Second phase of future support for x86_64 5-level page tables

Dave Anderson <anderson@xxxxxxxxxx> · Thu, 11 Jan 2018 15:56:08 -0500 (EST)

----- Original Message -----
> I found Dave had already done the first phase of future support for x86_64
> 5-level page tables (commit 307e7f35f510). When I asked him about the
> state of this work, he gave me a more detailed answer and suggestion.
> I followed his advice and did the following work.
> 
> 
> 1. Refine the original logic:
>   1) Create some new common functions for getting the offset of page table
>   2) Replace the PML4 and UPML with the common PGD:
>      machdep->machspec->pml4/upml ==> machdep->pgd
>   3) Use the PUD in x86_64
> 
> 2. Add 5-level page tables support for x86_64_k/uvtop()
> 
> This patchset is the second phase of the work. As Dave said, we need a
> way to determine very early on whether the kernel page tables are
> using 5-level paging and whether each user-space task is using 4- or
> 5-level page tables. These will be done after this phase.
> 
> About test work:
> 
> I have tested this patchset with 4-level and 5-level page tables.
> 
> sadump, Xen, old Linux kernels, and RHEL4 have not been tested.

Hello Dou,

Thank you very much for the work you have done so far.  I have not spent
any time looking at the patches in detail, but instead I first ran a quick
test of the patch on a set of ~250 kernels that I keep around for testing, 
where I just ran the "mod" command to at least verify that kernel virtual
addresses could be translated.  

Now, as always, backwards compatibility must be maintained.  I do not have
any sadump dumpfiles, but obviously you (Fujitsu) can test those.  However
I do have some older Xen and RHEL4-era kernels in my sample set.  

As it turns out, *all* RHEL4-era kernels failed (i.e., any kernel version
2.6.9 or earlier).  They report "WARNING: cannot access vmalloc'd module
memory" during session initialization, when crash tries to gather the
kernel module list:

  $ crash vmlinux-2.6.9-42.0.2.ELsmp.gz vmcore

  crash 7.2.1rc26
  Copyright (C) 2002-2017  Red Hat, Inc.
  Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
  Copyright (C) 1999-2006  Hewlett-Packard Co
  Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
  Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
  Copyright (C) 2005, 2011  NEC Corporation
  Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
  Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
  This program is free software, covered by the GNU General Public License,
  and you are welcome to change it and/or distribute copies of it under
  certain conditions.  Enter "help copying" to see the conditions.
  This program has absolutely no warranty.  Enter "help warranty" for details.

  GNU gdb (GDB) 7.6
  Copyright (C) 2013 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-unknown-linux-gnu"...

  please wait... (gathering module symbol data)   
  WARNING: cannot access vmalloc'd module memory

        KERNEL: vmlinux-2.6.9-42.0.2.ELsmp.gz
      DUMPFILE: vmcore
          CPUS: 8
          DATE: Tue Nov 21 19:14:17 2006
        UPTIME: 6 days, 01:23:25
  LOAD AVERAGE: 24.34, 7.89, 4.46
         TASKS: 865
      NODENAME: lonrs00268
       RELEASE: 2.6.9-42.0.2.ELsmp
       VERSION: #1 SMP Thu Aug 17 17:57:31 EDT 2006
       MACHINE: x86_64  (2199 Mhz)
        MEMORY: 16 GB
         PANIC: "Kernel BUG at panic:75"
           PID: 20046
       COMMAND: "oracle"
          TASK: 101c6b047f0  [THREAD_INFO: 101a428a000]
           CPU: 7
         STATE: TASK_RUNNING (NMI)

  crash> 

If I run the session with "crash -d4 vmlinux-2.6.9-42.0.2.ELsmp.gz vmcore",
you can see that it reads a "pud page", but then fails:

  ...
  please wait... (gathering module symbol data)module: ffffffffa0634180
  <readmem: ffffffffa0634180, KVADDR, "module struct", 1408, (ROE|Q), f73780>
  <readmem: 4f8000, PHYSADDR, "pud page", 4096, (FOE), 2080b40>
  <read_diskdump: addr: 4f8000 paddr: 4f8000 cnt: 4096>

  crash: invalid kernel virtual address: ffffffffa0634180  type: "module struct"

  WARNING: cannot access vmalloc'd module memory
  ...

Without the patch, the module virtual address translation succeeds:

  ...
  please wait... (gathering module symbol data)module: ffffffffa0634180
  <readmem: ffffffffa0634180, KVADDR, "module struct", 1408, (ROE|Q), f705e0>
  <readmem: 103000, PHYSADDR, "pgd page", 4096, (FOE), 25d7b50>
  <read_diskdump: addr: 103000 paddr: 103000 cnt: 4096>
  <readmem: 105000, PHYSADDR, "pmd page", 4096, (FOE), 25d8b60>
  <read_diskdump: addr: 105000 paddr: 105000 cnt: 4096>
  <readmem: d9bfb0000, PHYSADDR, "page table", 4096, (FOE), 25d9b70>
  <read_diskdump: addr: d9bfb0000 paddr: d9bfb0000 cnt: 4096>
  <read_diskdump: addr: ffffffffa0634180 paddr: d9bfb3180 cnt: 1408>
  ...

So it appears to be reading from the wrong starting page table location,
i.e., from "pud page 4f8000" instead of "pgd page 103000".
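
To make the expected walk concrete, here is a minimal sketch (not taken from
the crash sources) of how a 4-level x86_64 virtual address with 4 KiB pages
splits into table indices.  The names split_vaddr_4level() and
struct walk_indices are hypothetical, and the constants assume the standard
layout of 9 index bits per level above a 12-bit page offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed constants: classic 4-level x86_64 paging with 4 KiB pages. */
#define SKETCH_PAGE_SHIFT      12
#define SKETCH_BITS_PER_LEVEL  9
#define SKETCH_ENTRIES_PER_TBL (1u << SKETCH_BITS_PER_LEVEL)   /* 512 */

/* Hypothetical container for the result of splitting an address. */
struct walk_indices {
    unsigned level[4];  /* level[0] = top-level table ... level[3] = page table */
    unsigned offset;    /* byte offset within the 4 KiB page */
};

/* Split a virtual address into its four table indices and page offset. */
static void split_vaddr_4level(uint64_t vaddr, struct walk_indices *out)
{
    assert(out != NULL);

    for (int i = 0; i < 4; i++) {
        /* Shifts are 39, 30, 21 and 12 for levels 0 through 3. */
        int shift = SKETCH_PAGE_SHIFT + SKETCH_BITS_PER_LEVEL * (3 - i);
        out->level[i] =
            (unsigned)((vaddr >> shift) & (SKETCH_ENTRIES_PER_TBL - 1));
    }
    out->offset = (unsigned)(vaddr & ((1u << SKETCH_PAGE_SHIFT) - 1));
}
```

For the module address ffffffffa0634180 this yields indices 511, 510, 259
and 52 with a page offset of 0x180; the offset agrees with the low bits of
the final paddr d9bfb3180 in the unpatched trace.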

Also, several Xen kernels failed with segmentation violations during
session initialization.  They all fail here in x86_64_xendump_load_page(),
when "*pgd" gets referenced:

  static char *
  x86_64_xendump_load_page(ulong kvaddr, struct xendump_data *xd)
  {
          ulong mfn;
          ulong *pgd, *pud, *pmd, *ptep;

          pgd = ((ulong *)machdep->pgd) + pgd_index(kvaddr);
          mfn = ((*pgd) & PHYSICAL_PAGE_MASK) >> PAGESHIFT();  
                  ^^^^

Here is the relevant part of the gdb trace from a 2.6.18-based Xen
kernel:

Program terminated with signal 11, Segmentation fault.
#0  0x0000000000502748 in x86_64_xendump_load_page (kvaddr=kvaddr@entry=18446744071568498888, xd=0xf521a0 <xendump_data>, 
    xd=0xf521a0 <xendump_data>) at x86_64.c:7003
7003		mfn = ((*pgd) & PHYSICAL_PAGE_MASK) >> PAGESHIFT();
Missing separate debuginfos, use: debuginfo-install glibc-2.17-157.el7.x86_64 libgcc-4.8.5-11.el7.x86_64 libstdc++-4.8.5-11.el7.x86_64 lzo-2.06-8.el7.x86_64 ncurses-libs-5.9-13.20130511.el7.x86_64 snappy-1.1.0-3.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  0x0000000000502748 in x86_64_xendump_load_page (kvaddr=kvaddr@entry=18446744071568498888, xd=0xf521a0 <xendump_data>, 
    xd=0xf521a0 <xendump_data>) at x86_64.c:7003
#1  0x0000000000503191 in x86_64_xendump_p2m_create (xd=0xf521a0 <xendump_data>) at x86_64.c:6749
#2  0x0000000000565d4e in xc_core_create_pfn_tables () at xendump.c:1258
#3  xc_core_read (addr=<optimized out>, paddr=7080864, cnt=32, bufptr=0xf70f80 <shared_bufs>) at xendump.c:168
#4  read_xendump (fd=<optimized out>, bufptr=0xf70f80 <shared_bufs>, cnt=32, addr=<optimized out>, paddr=7080864) at xendump.c:836
#5  0x000000000047b038 in readmem (addr=18446744071569148832, memtype=memtype@entry=1, buffer=buffer@entry=0xf70f80 <shared_bufs>, 
    size=size@entry=32, type=type@entry=0x94dcc3 "possible", error_handle=error_handle@entry=2) at memory.c:2233
#6  0x00000000004ea33e in cpu_maps_init () at kernel.c:903
#7  kernel_init () at kernel.c:118
#8  0x0000000000467e5a in main_loop () at main.c:768
#9  0x000000000069dad3 in captured_command_loop (data=data@entry=0x0) at main.c:258
#10 0x000000000069c37a in catch_errors (func=func@entry=0x69dac0 <captured_command_loop>, func_args=func_args@entry=0x0, 
    errstring=errstring@entry=0x8e713f "", mask=mask@entry=6) at exceptions.c:557
#11 0x000000000069ea66 in captured_main (data=data@entry=0x7ffd637c92a0) at main.c:1064
#12 0x000000000069c37a in catch_errors (func=func@entry=0x69dda0 <captured_main>, func_args=func_args@entry=0x7ffd637c92a0, 
    errstring=errstring@entry=0x8e713f "", mask=mask@entry=6) at exceptions.c:557
#13 0x000000000069edc7 in gdb_main (args=0x7ffd637c92a0) at main.c:1079
#14 gdb_main_entry (argc=<optimized out>, argv=argv@entry=0x7ffd637c9408) at main.c:1099
#15 0x00000000004f0604 in gdb_main_loop (argc=<optimized out>, argc@entry=3, argv=argv@entry=0x7ffd637c9408) at gdb_interface.c:76
#16 0x00000000004662c5 in main (argc=3, argv=0x7ffd637c9408) at main.c:707
(gdb) p pgd
$1 = (ulong *) 0xfffffffc054f4210
(gdb) 

I haven't investigated further, but in all of the Xen cases, the
value of "pgd" was a kernel virtual address, as in the example above.

However, without the patch, the function looks like this, and with 
my debug printf of "pml4", the address is a user-space address as
expected:

  static char *
  x86_64_xendump_load_page(ulong kvaddr, struct xendump_data *xd)
  {
          ulong mfn;
          ulong *pml4, *pgd, *pmd, *ptep;

          pml4 = ((ulong *)machdep->machspec->pml4) + pml4_index(kvaddr);
          mfn = ((*pml4) & PHYSICAL_PAGE_MASK) >> PAGESHIFT();

  fprintf(fp, "x86_64_xendump_load_page: pml4: %lx\n", (ulong)pml4);

  ...

So for example, with the debug statement, I see this:

  # crash vmlinux-2.6.18-1.2714.el5xen.gz xguest-crashdump

  crash 7.2.1rc26
  ...

  x86_64_xendump_load_page: pml4: 25d6c08
  x86_64_xendump_load_page: pml4: 25d6c08
        KERNEL: vmlinux-2.6.18-1.2714.el5xen.gz
      DUMPFILE: xguest-crashdump
  ...
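
The difference between the two pointer values can be checked mechanically.
The following is only a sketch of a debugging aid, not code from crash:
looks_like_kernel_vaddr() is a hypothetical helper, and the boundary
constant assumes the 4-level canonical address split (it would differ with
5-level paging).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed start of the kernel half of the canonical x86_64 address space
 * with 4-level paging.
 */
#define SKETCH_KERNEL_HALF_START 0xffff800000000000ULL

/*
 * Return true if the value falls in the kernel half of the address space,
 * i.e. it cannot be a valid pointer into the crash process's own memory.
 */
static bool looks_like_kernel_vaddr(uint64_t addr)
{
    return addr >= SKETCH_KERNEL_HALF_START;
}
```

With the values shown above, 0xfffffffc054f4210 (the patched "pgd") is
flagged as a kernel address, while 0x25d6c08 (the unpatched "pml4") is not.
A check of this kind placed before the dereference would turn the
segmentation violation into a diagnosable error.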

In a private email, I will send you a pointer to where I have temporarily
stored the two vmlinux/vmcore pairs shown above.  I expect it will be
fairly easy for you to figure out what's happening in both cases.

Again, I very much appreciate the work you have undertaken here.

Thanks,
  Dave
