Yep. My main kernel is the customized 1G userspace / 3G kernelspace; my
"dump" kernel is 3G userspace / 1G kernelspace. I originally tried using
the customized kernel for both purposes, but I get all kinds of ugly
recursive task error messages once the kernel loads part way.

-Kevin

________________________________________
From: crash-utility-bounces@xxxxxxxxxx [crash-utility-bounces@xxxxxxxxxx] On Behalf Of Dave Anderson [anderson@xxxxxxxxxx]
Sent: Thursday, October 16, 2008 6:11 AM
To: Discussion list for crash utility usage, maintenance and development
Subject: Re: "cannot access vmalloc'd module memory" when loading kdump'ed vmcore in crash

----- "Kevin Worth" <kevin.worth@xxxxxx> wrote:

> Tried version 2.0 of kexec-tools (released 7/19/2008) and still have
> the same problem with zero'ed out module info. Sounds like perhaps it
> is narrowed down to the 2.6.20 kernel kdump code (the most difficult
> part to change out, but the part that likely gets a lot of eyes on it
> and probably has the issue fixed in current versions). :\

Certainly possible. From my understanding, the kexec-tools piece on the
primary kernel side is mostly concerned with creating the proper ELF
headers for the second kernel to use when /proc/vmcore is read. And
aside from the bogus p_vaddr fields that were using the hardwired
c0000000 PAGE_OFFSET based values, things looked correct.

When the user-space code on the secondary kernel reads the /proc/vmcore
pseudo-file, the secondary kdump kernel dynamically kmaps and reads the
physical memory from the "oldmem" location, and copies it out to
user-space. Why it would read zeroed-out memory from those locations is
hard to understand.

I was looking at the "fs/proc/vmcore.c" sources yesterday in our RHEL5
kernel (2.6.18 + patches) for something that could be an issue with the
1GB/3GB split, and also given that your secondary kdump kernel is still
a 3GB/1GB kernel (is that right?). But I couldn't see anything there.

Dave
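The "bogus p_vaddr" problem mentioned above is easy to see in
isolation: for a direct-mapped PT_LOAD segment, the virtual address is
derived by adding PAGE_OFFSET to the physical address, so a tool that
hardwires c0000000 produces p_vaddr values that are 2GB off on a
1G-user/3G-kernel split. A minimal sketch, assuming the usual x86
virt = phys + PAGE_OFFSET relationship; the constant names below are
illustrative, not the actual kexec-tools identifiers:

    #include <stdio.h>
    #include <stdint.h>

    #define HARDWIRED_PAGE_OFFSET 0xc0000000ULL  /* assumed by stock tools */
    #define ACTUAL_PAGE_OFFSET    0x40000000ULL  /* 1G user / 3G kernel   */

    int main(void)
    {
        uint64_t p_paddr = 0x1000000;  /* sample physical load address */

        /* A wrong PAGE_OFFSET shifts every direct-mapped p_vaddr by 2GB */
        printf("hardwired p_vaddr: %#llx\n",
               (unsigned long long)(p_paddr + HARDWIRED_PAGE_OFFSET));
        printf("correct   p_vaddr: %#llx\n",
               (unsigned long long)(p_paddr + ACTUAL_PAGE_OFFSET));
        return 0;
    }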
> -Kevin
>
> -----Original Message-----
> From: crash-utility-bounces@xxxxxxxxxx
> [mailto:crash-utility-bounces@xxxxxxxxxx] On Behalf Of Worth, Kevin
> Sent: Wednesday, October 15, 2008 2:43 PM
> To: Discussion list for crash utility usage, maintenance and development
> Subject: RE: "cannot access vmalloc'd module memory" when loading
> kdump'ed vmcore in crash
>
> Sorry, please ignore the second paragraph... I am already running the
> most recent version in Ubuntu, 20070330 from Simon Horman's
> kexec-tools-testing, and kernel 2.6.20. ... may try a newer version of
> kexec-tools-testing to see if anything changes.
>
> -Kevin
>
> -----Original Message-----
> From: Worth, Kevin
> Sent: Wednesday, October 15, 2008 2:31 PM
> To: Discussion list for crash utility usage, maintenance and development
> Subject: RE: "cannot access vmalloc'd module memory" when loading
> kdump'ed vmcore in crash
>
> So Dave, at this point am I correct in the assumption that this is not
> a problem with crash, but with the dump file itself? I tried one more
> go at modifying kexec-tools to have the correct PAGE_OFFSET defined and
> still got the same type of results (all zeroes at the module's
> address), so that doesn't seem to be it.
>
> Maybe this is a better question to take to the kexec mailing list, but
> do you know where the line is drawn between the kernel support and the
> userspace (kexec-tools)? I'm presuming that the kernel support is tied
> to each kernel (i.e. since I'm on 2.6.20 this issue could have been
> resolved in a more recent kernel). I'm wondering if I can pull a newer
> kexec-tools, whether it might work with 2.6.20, and whether it might
> have this issue resolved.
>
> -Kevin
>
> -----Original Message-----
> From: crash-utility-bounces@xxxxxxxxxx
> [mailto:crash-utility-bounces@xxxxxxxxxx] On Behalf Of Dave Anderson
> Sent: Wednesday, October 15, 2008 6:53 AM
> To: Discussion list for crash utility usage, maintenance and development
> Subject: Re: "cannot access vmalloc'd module memory" when loading
> kdump'ed vmcore in crash
>
> ----- "Kevin Worth" <kevin.worth@xxxxxx> wrote:
>
> > Hi Dave,
> >
> > Before you responded I noticed that a simple "make modules" didn't
> > work because my kernel wasn't exporting the symbol. Rather than do
> > anything risky or complex that might muck up the troubleshooting
> > process, I just rebuilt the kernel. It built just fine and now I can
> > load crash and I see "DUMPFILE: /dev/crash" when I load up crash.
> > Let me try walking through the steps that you had me do previously,
> > this time using /dev/crash instead of /dev/mem and /dev/kmem.
>
> You made one small error (but not totally fatal) in the suggested
> steps. See my comments below...
>
> > From my limited understanding of what's going on here, it would
> > appear that the dump file is missing some data, or else crash is
> > looking in the wrong place for it.
>
> The crash utility is a slave to what is indicated in the PT_LOAD
> segments of the ELF header of the kdump vmcore. In the case of the
> physical memory chunk that starts at 4GB physical on your machine,
> this is what's in the ELF header (from your original "crash.log"
> file):
>
>  Elf64_Phdr:
>       p_type: 1 (PT_LOAD)
>     p_offset: 3144876760 (bb7302d8)
>      p_vaddr: ffffffffffffffff
>      p_paddr: 100000000
>     p_filesz: 1073741824 (40000000)
>      p_memsz: 1073741824 (40000000)
>      p_flags: 7 (PF_X|PF_W|PF_R)
>      p_align: 0
>
> What that says is: for the range of physical memory starting at
> 0x100000000 (p_paddr), the vmcore contains a block of memory starting
> at file offset (p_offset) 3144876760/0xbb7302d8 that is
> 1073741824/0x40000000 (p_filesz) bytes long.
>
> More simply put, the 1GB of physical memory from 4GB to 5GB can be
> found in the vmcore file starting at file offset 3144876760.
>
> So if a request for physical memory page 0x100000000 comes in, the
> crash utility reads from vmcore file offset 3144876760. If the next
> physical page were requested, i.e., at 0x100001000, it would read from
> vmcore file offset 3144876760+4096. It's as simple as that -- so when
> you suggest that "crash is looking in the wrong place for it", well,
> there's nothing that the crash utility can do differently.
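The arithmetic described above is easy to reproduce by hand. A minimal
sketch (the segment values are taken from the Elf64_Phdr quoted above;
nothing here comes from the actual crash sources):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* values from the Elf64_Phdr quoted above */
        uint64_t phys_start  = 0x100000000ULL;  /* p_paddr  */
        uint64_t file_offset = 0xbb7302d8ULL;   /* p_offset (3144876760) */

        uint64_t paddr = 0x100001000ULL;        /* sample physical page */

        /* vmcore offset = (paddr - segment start) + segment file offset */
        printf("paddr %#llx -> vmcore file offset %llu\n",
               (unsigned long long)paddr,
               (unsigned long long)((paddr - phys_start) + file_offset));
        return 0;
    }

Compiled and run, this prints 3144880856, i.e., 3144876760 + 4096,
matching the example in the preceding paragraph.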
> Now, back to the test sequence:
>
> > ---Live system---
> >
> >       KERNEL: vmlinux-devcrash
> >     DUMPFILE: /dev/crash
> >         CPUS: 2
> >         DATE: Tue Oct 14 16:08:28 2008
> >       UPTIME: 00:02:07
> > LOAD AVERAGE: 0.17, 0.08, 0.03
> >        TASKS: 97
> >     NODENAME: test-machine
> >      RELEASE: 2.6.20-17.39-custom2
> >      VERSION: #1 SMP Tue Oct 14 13:45:17 PDT 2008
> >      MACHINE: i686  (2200 Mhz)
> >       MEMORY: 5 GB
> >          PID: 5628
> >      COMMAND: "crash"
> >         TASK: 5d4c2560  [THREAD_INFO: f3de6000]
> >          CPU: 1
> >        STATE: TASK_RUNNING (ACTIVE)
> >
> > crash> p modules
> > modules = $2 = {
> >   next = 0xf8a3ea04,
> >   prev = 0xf8842104
> > }
> >
> > crash> module 0xf8a3ea00
> > struct module {
> >   state = MODULE_STATE_LIVE,
> >   list = {
> >     next = 0xf8d10484,
> >     prev = 0x403c63a4
> >   },
> >   name = "crash\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
> >   mkobj = {
> >     kobj = {
> >       k_name = 0xf8a3ea4c "crash",
> >       name = "crash\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
> >       kref = {
> >         refcount = {
> >           counter = 3
> >         }
> >       },
> >       entry = {
> >         next = 0x403c6068,
> >         prev = 0xf8d104e4
> >       },
> >       parent = 0x403c6074
> >   ...
> >
> > crash> vtop 0xf8a3ea00
> > VIRTUAL   PHYSICAL
> > f8a3ea00  116017a00
>
> OK -- so the physical memory location of the module data structure is
> at physical address 116017a00, but...
>
> >    PAGE DIRECTORY: 4044b000
> >      PGD: 4044b018 => 6001
> >      PMD:     6e28 => 1d51a067
> >      PTE: 1d51a1f0 => 116017163
> >     PAGE: 116017000
> >
> >       PTE   PHYSICAL   FLAGS
> > 116017163  116017000   (PRESENT|RW|ACCESSED|DIRTY|GLOBAL)
> >
> >      PAGE   PHYSICAL   MAPPING   INDEX  CNT  FLAGS
> > 472c02e0   116017000         0  229173    1  80000000
>
> You're reading from the beginning of the page, i.e., 116017000,
> instead of where the module structure is at 116017a00:
>
> > crash> rd -p 116017000 30
> > 116017000:  53e58955 d089c389 4d8bca89 74c98508   U..S.......M...t
> > 116017010:  01e9831f b85b0d74 ffffffea ffffba5d   ....t.[.....]...
> > 116017020:  03c3ffff 53132043 26b48d24 00000000   ....C .S$..&....
> > 116017030:  89204389 5d5b2453 26b48dc3 00000000   .C .S$[]...&....
> > 116017040:  83e58955 55892cec 08558be4 89f45d89   U....,.U..U..]..
> > 116017050:  7d89f875 ffeabffc 4d89ffff 8b028be0   u..}.......M....
> > 116017060:  c3890452 ac0fd689 45890cf3 0ceec1ec   R..........E....
> > 116017070:  5589c889 89d231f0                     ...U.1..
> > crash>
>
> So therefore you're not seeing the "crash" strings embedded in the raw
> physical data. Now, although it would have been "nice" if you could
> have shown the contents of the module structure via the physical
> address, the fact remains that since you used the /dev/crash driver,
> the "module 0xf8a3ea00" command required that the crash utility first
> translate the vmalloc address into its physical equivalent, and then
> read from there.
>
> In any case, you do have a dump of physical memory from 116017000,
> which at least is in the same 4k page as the module data structure, so
> it should not change when read from the dumpfile.
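The small error is just page arithmetic: vtop reported the page base
(116017000), and the rd -p was pointed at that base rather than at the
structure sitting 0xa00 bytes into the page. A throwaway sketch of the
split, assuming the 4k page size shown in the vtop output:

    #include <stdio.h>
    #include <stdint.h>

    #define PAGE_SIZE 4096ULL

    int main(void)
    {
        uint64_t module_paddr = 0x116017a00ULL;  /* vtop result above */

        /* mask off the low bits for the page base, keep them for the
         * offset of the module struct within that page */
        printf("page base:      %#llx\n",
               (unsigned long long)(module_paddr & ~(PAGE_SIZE - 1)));
        printf("offset in page: %#llx\n",
               (unsigned long long)(module_paddr & (PAGE_SIZE - 1)));
        return 0;
    }

This prints page base 0x116017000 and offset 0xa00; a
"rd -p 116017a00 30" would have landed on the structure itself.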
> > ---Using dump file---
> >
> > please wait... (gathering module symbol data)
> > WARNING: cannot access vmalloc'd module memory
> >
> >       KERNEL: vmlinux-devcrash
> >     DUMPFILE: /var/crash/vmcore
> >         CPUS: 2
> >         DATE: Tue Oct 14 16:09:32 2008
> >       UPTIME: 00:03:12
> > LOAD AVERAGE: 0.09, 0.08, 0.02
> >        TASKS: 97
> >     NODENAME: test-machine
> >      RELEASE: 2.6.20-17.39-custom2
> >      VERSION: #1 SMP Tue Oct 14 13:45:17 PDT 2008
> >      MACHINE: i686  (2200 Mhz)
> >       MEMORY: 5 GB
> >        PANIC: "[  192.148000] SysRq : Trigger a crashdump"
> >          PID: 0
> >      COMMAND: "swapper"
> >         TASK: 403c0440  (1 of 2)  [THREAD_INFO: 403f2000]
> >          CPU: 0
> >        STATE: TASK_RUNNING (SYSRQ)
> >
> > crash> p modules
> > modules = $2 = {
> >   next = 0xf8a3ea04,
> >   prev = 0xf8842104
> > }
> >
> > crash> module 0xf8a3ea00
> > struct module {
> >   state = MODULE_STATE_LIVE,
> >   list = {
> >     next = 0x0,
> >     prev = 0x0
> >   },
> >   name = "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
> >   mkobj = {
> >     kobj = {
> >       k_name = 0x0,
> >       name = "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
> >       kref = {
> >         refcount = {
> >           counter = 0
> >         }
> >       },
> >       entry = {
> >         next = 0x0,
> >         prev = 0x0
> >   ...
> >
> > crash> vtop 0xf8a3ea00
> > VIRTUAL   PHYSICAL
> > f8a3ea00  116017a00
> >
> >    PAGE DIRECTORY: 4044b000
> >      PGD: 4044b018 => 6001
> >      PMD:     6e28 => 1d51a067
> >      PTE: 1d51a1f0 => 116017163
> >     PAGE: 116017000
> >
> >       PTE   PHYSICAL   FLAGS
> > 116017163  116017000   (PRESENT|RW|ACCESSED|DIRTY|GLOBAL)
> >
> >      PAGE   PHYSICAL   MAPPING   INDEX  CNT  FLAGS
> > 472c02e0   116017000         0  229173    1  80000000
> >
> > crash> rd -p 116017000 30
> > 116017000:  00000000 00000000 00000000 00000000   ................
> > 116017010:  00000000 00000000 00000000 00000000   ................
> > 116017020:  00000000 00000000 00000000 00000000   ................
> > 116017030:  00000000 00000000 00000000 00000000   ................
> > 116017040:  00000000 00000000 00000000 00000000   ................
> > 116017050:  00000000 00000000 00000000 00000000   ................
> > 116017060:  00000000 00000000 00000000 00000000   ................
> > 116017070:  00000000 00000000                     ........
> > crash>
>
> Now we're reading, from the dumpfile, the same physical address that
> you read on the live system, and it's returning all zeroes. And the
> "module 0xf8a3ea00" above shows all zeroes from a higher location in
> the page because the same vmalloc translation is done to turn it into
> a physical address before reading it from the vmcore file. But instead
> of using the /dev/crash driver to access the translated physical
> memory, the crash utility uses the information from the ELF header's
> PT_LOAD segments to find the page data in the vmcore file.
>
> So, anyway, the "rd -p 116017000 30" command that you ran on both the
> live system and the dumpfile should yield the same data.
>
> It seems like in all examples to date, the file data read at the
> greater-than-4GB PT_LOAD segment returns zeroes.
>
> You can verify this from the crash utility's viewpoint by doing a
> "help -n" at runtime when running with the dumpfile, which will show
> you both the actual contents of the ELF header, as well as the manner
> in which the PT_LOAD data is stored for its use. (It's also shown in
> the "crash -d7 ..." output.)
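Another way to verify is to take the crash utility out of the picture
entirely: seek straight to the vmcore file offset that the PT_LOAD
arithmetic yields for physical address 116017000, i.e.
(0x116017000 - 0x100000000) + 0xbb7302d8 = 0xd17472d8, and dump
whatever bytes are there. A throwaway sketch; the vmcore path is an
assumption, adjust as needed:

    #define _FILE_OFFSET_BITS 64   /* 32-bit host, >4GB file offsets */
    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        unsigned char buf[128];
        int i, fd;

        fd = open("/var/crash/vmcore", O_RDONLY);  /* assumed path */
        if (fd < 0) {
            perror("open");
            return 1;
        }
        /* file offset computed from the PT_LOAD segment values above */
        if (lseek(fd, (off_t)0xd17472d8, SEEK_SET) == (off_t)-1) {
            perror("lseek");
            return 1;
        }
        if (read(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
            perror("read");
            return 1;
        }
        for (i = 0; i < (int)sizeof(buf); i++)
            printf("%02x%c", buf[i], (i % 16 == 15) ? '\n' : ' ');
        close(fd);
        return 0;
    }

If this prints zeroes too, the zeroes are really in the file and the
problem is upstream of crash; if it prints the module data, the offset
calculation would be the thing to scrutinize.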
> So again, from your original "crash.log" file, here is what the ELF
> header's PT_LOAD segment contains:
>
>  Elf64_Phdr:
>       p_type: 1 (PT_LOAD)
>     p_offset: 3144876760 (bb7302d8)
>      p_vaddr: ffffffffffffffff
>      p_paddr: 100000000
>     p_filesz: 1073741824 (40000000)
>      p_memsz: 1073741824 (40000000)
>      p_flags: 7 (PF_X|PF_W|PF_R)
>      p_align: 0
>
> And this is what the crash utility stored in its internal data
> structure for that particular segment:
>
>   pt_load_segment[4]:
>        file_offset: bb7302d8
>         phys_start: 100000000
>           phys_end: 140000000
>          zero_fill: 0
>
> And when the physical memory read request comes in, it filters down to
> this part of the crash utility's read_netdump() function in netdump.c:
>
>         for (i = offset = 0; i < nd->num_pt_load_segments; i++) {
>                 pls = &nd->pt_load_segments[i];
>                 if ((paddr >= pls->phys_start) &&
>                     (paddr < pls->phys_end)) {
>                         offset = (off_t)(paddr - pls->phys_start) +
>                                 pls->file_offset;
>                         break;
>                 }
>                 if (pls->zero_fill && (paddr >= pls->phys_end) &&
>                     (paddr < pls->zero_fill)) {
>                         memset(bufptr, 0, cnt);
>                         return cnt;
>                 }
>         }
>
> So for any physical address request between 100000000 and 140000000
> (4GB to 5GB), it will calculate the offset to seek to by subtracting
> 100000000 from the incoming physical address, and adding the
> difference to the starting file offset of the whole segment.
>
> So if you wanted to, you could put debug code just prior to the
> "break" above that shows the pls->file_offset for a given incoming
> physical address. But this code has been in place forever, so it's
> hard to conceive that somehow it's not working in the case of this
> dumpfile. But presuming that it *does* go to the correct file offset
> location in the vmcore, and it's getting bogus data from there, then
> there's nothing that the crash utility can do about it.
>
> Dave

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
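The debug code suggested just prior to the "break" might look like the
following (a sketch only: the variable names follow the read_netdump()
excerpt quoted above, and the fprintf line is the lone assumed
addition, not something in the actual crash sources):

    for (i = offset = 0; i < nd->num_pt_load_segments; i++) {
            pls = &nd->pt_load_segments[i];
            if ((paddr >= pls->phys_start) &&
                (paddr < pls->phys_end)) {
                    offset = (off_t)(paddr - pls->phys_start) +
                            pls->file_offset;
                    /* debug: show the segment chosen and the resulting
                     * vmcore file offset for this physical read */
                    fprintf(stderr,
                        "paddr %llx: pt_load_segment[%d] offset %llx\n",
                        (unsigned long long)paddr, i,
                        (unsigned long long)offset);
                    break;
            }
            if (pls->zero_fill && (paddr >= pls->phys_end) &&
                (paddr < pls->zero_fill)) {
                    memset(bufptr, 0, cnt);
                    return cnt;
            }
    }

For reads in the 4GB-to-5GB range, the printed offsets should track
0xbb7302d8 plus the distance of paddr above 0x100000000; any other
value would point at the offset calculation rather than the file
contents.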