Re: [PATCH v2] Add read diagnostics to crash

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Dave,

On 01/10/2012 09:24 AM, Dave Anderson wrote:


----- Original Message -----
Re-posting the patch as an attachment. I'm sorry for the mess on the
first pass.

--
David.

Hi Dave,

Your patch has merit, but I'm going to pare it down a bit.  I use CRASHDEBUG(8)
fairly frequently, but with your changes as-is, it's far too verbose.  It's also
a bit redundant -- here's an example of a CRASHDEBUG(8) kdump read currently:

   crash>  rd ffff88003c1a5cc0
   <addr: ffff88003c1a5cc0 count: 1 flag: 490 (KVADDR)>
   <readmem: ffff88003c1a5cc0, KVADDR, "64-bit KVADDR", 8, (FOE), 7fff0096b5a8>
       addr: ffff88003c1a5cc0  paddr: 3c1a5cc0  cnt: 8
   ffff88003c1a5cc0:  0000000000000000                    ........
        text hit rate: 65% (3505 of 5378)
   crash>

And then with your patch:

   crash>  rd ffff88003c1a5cc0
   <addr: ffff88003c1a5cc0 count: 1 flag: 490 (KVADDR)>
   <readmem: ffff88003c1a5cc0, KVADDR, "64-bit KVADDR", 8, (FOE), 7fff1c97b6e8>
       addr: ffff88003c1a5cc0  paddr: 3c1a5cc0  cnt: 8
   rd: read_kdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
   rd: read_kdump trying read_netdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
   rd: read_netdump(-1, 7fff1c97b6e8, 8, ffff88003c1a5cc0, 3c1a5cc0)
   rd: read KDUMP or NETDUMP_ELF64 for multi-segment: 3c195fb8
   rd: seek and read offset: 3c195fb8
   ffff88003c1a5cc0:  0000000000000000                    ........
        text hit rate: 65% (3505 of 5378)
   crash>

The "<readmem: ...>" and subsequent "addr: ..." lines come from readmem().
I don't understand the need for the next 3 lines, as they are completely
  redundant:

Except that one intention was to show program flow (failures that didn't happen) as well as continuation of program state.

*	read_kdump() entered
rd: read_kdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0

*	None of the conditional fatal error exits in read_kdump()
rd: read_kdump trying read_netdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0

*	read_netdump() entered
rd: read_netdump(-1, 7fff1c97b6e8, 8, ffff88003c1a5cc0, 3c1a5cc0)

*	switch (DUMPFILE_FORMAT(nd->flags)) VIA case KDUMP_ELF64
*	show calculated offset
rd: read KDUMP or NETDUMP_ELF64 for multi-segment: 3c195fb8

*	Not the if (FLAT_FORMAT()) read case
rd: seek and read offset: 3c195fb8

   rd: read_kdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
   rd: read_kdump trying read_netdump for: virt=ffff88003c1a5cc0 phys=3c1a5cc0
   rd: read_netdump(-1, 7fff1c97b6e8, 8, ffff88003c1a5cc0, 3c1a5cc0)

I don't argue with the displays of the resultant file offsets, but I may move
them to CRASHDEBUG(9):

I wouldn't object and can re-post a CRASHDEBUG(9) version myself if you'd prefer.

   rd: read KDUMP or NETDUMP_ELF64 for multi-segment: 3c195fb8
   rd: seek and read offset: 3c195fb8

And the same goes for compressed kdumps:

   crash>  rd ffff81005286e0c0
   <addr: ffff81005286e0c0 count: 1 flag: 490 (KVADDR)>
   <readmem: ffff81005286e0c0, KVADDR, "64-bit KVADDR", 8, (FOE), 7fff11d16958>
       addr: ffff81005286e0c0  paddr: 5286e0c0  cnt: 8
   ffff81005286e0c0:  0000000000000000                    ........
        text hit rate: 66% (239 of 357)
   crash>

With the patch, aside from the "pfn" displays, it's redundant information:

Except for the implied flow and continuation of state. I've been asked to look into alleged crash bugs that were no more than corrupt or incomplete dumps that required either hexdump style deciphering of the structure or a method of crash exposing its own deciphering of the structure because the program starts and exits with no more than an error code that can't always identify a unique point of failure and you'd need to find the cause in source anyway. For example, a case where an incomplete crash dump has page-present flags showing the page not present for kernel data structures (every single page present flag is zero in-fact). IOW, no crash bug because kdump created an incomplete crash dump.

   crash>  rd ffff81005286e0c0
   <addr: ffff81005286e0c0 count: 1 flag: 490 (KVADDR)>
   <readmem: ffff81005286e0c0, KVADDR, "64-bit KVADDR", 8, (FOE), 7fffd3ca4478>
       addr: ffff81005286e0c0  paddr: 5286e0c0  cnt: 8
   rd: read_diskdump(-1, 7fffd3ca4478, 8, ffff81005286e0c0, 5286e0c0), pfn=5286e
   rd: Buffering page with pfn=5286e and current physaddr=5286e000
   ffff81005286e0c0:  0000000000000000                    ........
        text hit rate: 66% (239 of 357)
   crash>

For starters, I think the patch can be simplified by modifying the "addr: ..."
line that is displayed by readmem() just prior to issuing it to the configured
pc->readmem function so that it shows the receiving function.  So that line,
which may be displayed several times if the read crosses a page boundary, could
be reworked to show something like this:

   crash>  rd ffff81005286e0c0
   <addr: ffff81005286e0c0 count: 1 flag: 490 (KVADDR)>
   <readmem: ffff81005286e0c0, KVADDR, "64-bit KVADDR", 8, (FOE), 7fffd3ca4478>
   <read_netdump: addr: ffff81005286e0c0 paddr: 5286e0c0 cnt: 8
   ...

and then a line or two from the relevant read functions would show
the file offset that was resolved from the physical address.

Also, the new function that translates the pc->readmem function address to a
string for display could also be used in a single location in main.c instead of
having to put debug statements in each location where where pc->readmem is
initialized.

Isn't there at least one case, e.g. x86_64_xen_kdump_p2m_create() that replaces and restores the pc->readmem value at runtime?

So let me play around with this a bit...

You are welcome too but I would be happy to take responsibility and re-work it too. If you do go ahead with a re-work then please consider my goal of monitoring program flow.

Thanks,
David

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility


[Index of Archives]     [Fedora Development]     [Fedora Desktop]     [Fedora SELinux]     [Yosemite News]     [KDE Users]     [Fedora Tools]

 

Powered by Linux