Hi Dave,
On 01/12/2012 02:01 PM, Dave Anderson wrote:
----- Original Message -----
Hi Dave,
On 01/12/2012 12:58 PM, Dave Anderson wrote:
----- Original Message -----
On 01/11/2012 06:53 AM, Petr Tesarik wrote:
Dne Út 10. ledna 2012 19:23:24 Petr Tesarik napsal(a):
Dne Út 10. ledna 2012 19:14:32 Petr Tesarik napsal(a):
... [ cut ] ...
crash> vtop f2800080
VIRTUAL PHYSICAL
f2800080 1fde00080
PAGE DIRECTORY: c08ed000
PGD: c08ed018 => 8ea001
PMD: 8eaca0 => 80000001fde001e3
PAGE: 1fde00000 (2MB)
PTE PHYSICAL FLAGS
80000001fde001e3 1fde00000
(PRESENT|RW|ACCESSED|DIRTY|PSE|GLOBAL|NX)
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
BTW the data from struct page is really missing here. I traced
this down to an
integer overflow in dump_memory_nodes():
... [ cut ] ...
David (Mair), could you address this, as already discussed in
private mails,
please?
The attached patch fixes this for me.
Hmmm, not so much for me...
When I test the patch on RHEL5, RHEL6 and Fedora x86 kernels, the
command always fails like this:
crash> kmem -n
NODE SIZE PGLIST_DATA BOOTMEM_DATA NODE_ZONES
0 262075 c0a3a680 c0aa5ce8 c0a3a680
c0a3b1c0
c0a3bd00
c0a3c840
MEM_MAP START_PADDR START_MAPNR
Segmentation fault
$
I haven't look into it, and this is probably not related:
cc -c -g -DX86 -m32 -D_FILE_OFFSET_BITS=64 -DGDB_7_3_1 memory.c
-Wall -O2 -Wstrict-prototypes -Wmissing-prototypes
-fstack-protector
memory.c: In function 'dump_memory_nodes':
memory.c:13410: warning: cast to pointer from integer of different
size
memory.c:13199: warning: 'node_start_paddr' may be used
uninitialized in this function
But from under gdb:
crash> kmem -n
[Detaching after fork from child process 16870.]
Program received signal SIGSEGV, Segmentation fault.
0x0809cd94 in mkstring (s=0xffe8a8bc " c0a3a680 ", size=16,
flags=133, opt=0x1000<Address 0x1000 out of bounds>)
at tools.c:1620
1620 sprintf(s, "%llx", *((ulonglong *)opt));
(gdb) bt
#0 0x0809cd94 in mkstring (s=0xffe8a8bc " c0a3a680 ", size=16,
flags=133, opt=0x1000<Address 0x1000 out of bounds>)
at tools.c:1620
#1 0x080b1ad7 in dump_memory_nodes (initialize=0) at
memory.c:13406
#2 0x080d46cc in cmd_kmem () at memory.c:4221
#3 0x080941b8 in exec_command () at main.c:751
#4 0x08093fe6 in main_loop () at main.c:699
#5 0x081db622 in current_interp_command_loop ()
#6 0x081dbf01 in captured_command_loop ()
#7 0x081db0a4 in catch_errors ()
#8 0x081dce07 in captured_main ()
#9 0x081db0a4 in catch_errors ()
#10 0x081dce49 in gdb_main ()
#11 0x081dce99 in gdb_main_entry ()
#12 0x08116668 in gdb_main_loop (argc=2, argv=0xffe8d494) at
gdb_interface.c:75
#13 0x08093ce0 in main (argc=3, argv=0xffe8d494) at main.c:603
(gdb) p opt
$1 = 0x1000<Address 0x1000 out of bounds>
(gdb)
I thought it might be just an x86 issue, but it also fails
the same way on RHEL5, RHEL6 and Fedora x86_64 kernels.
I'm looking into it, that's the change I made to the fprintf() that uses
node_start_paddr. The patch would be a fix for the problem Petr reported
if it did not include the expansion to 64-bits in the fprintf() that's
the whole last piece of the patch. The fix is pretty obvious now to the
original patch:
LONGLONG_HEX takes a ulonglong *
LONG_HEX takes a ulong
Changing the last modified MKSTR() to take&node_start_paddr should
resolve it. Patch attached.
--
David.
Right -- our mails crossed...
But I don't want the output format to change, especially on 64-bit
machines. For example on x86_64, it currently looks like these
examples, where the 3 fields are centered:
MEM_MAP START_PADDR START_MAPNR
ffff8100006e6000 0 0
MEM_MAP START_PADDR START_MAPNR
ffffea0004280000 130000000 1245184
MEM_MAP START_PADDR START_MAPNR
ffffea0000000038 1000 1
With your patch they looks like this:
crash> kmem -n
...
MEM_MAP START_PADDR START_MAPNR
ffff8100006e6000 0 0
MEM_MAP START_PADDR START_MAPNR
ffffea0004280000 130000000 1245184
MEM_MAP START_PADDR START_MAPNR
ffffea0000000038 1000 1
And even on 32-bit, it seems to vary. Here's 3 without
the patch:
MEM_MAP START_PADDR START_MAPNR
c9000000 0 0
MEM_MAP START_PADDR START_MAPNR
c188a020 1000 1
MEM_MAP START_PADDR START_MAPNR
f5d2c200 10000 16
And with it applied:
MEM_MAP START_PADDR START_MAPNR
c9000000 0 0
MEM_MAP START_PADDR START_MAPNR
c188a020 1000 1
MEM_MAP START_PADDR START_MAPNR
f5d2c200 10000 16
It should be possible to incorporate the variable-size
adjustment without changing the display.
The position used as the center for column two is offset three places to
the right with my patch but the numbers are still centered. The position
used as column three is offset five places to the right but the numbers
are actually still centered. It's because of the spaces I added in the
fprintf() format string thinking I was matching my expansion of the header:
- fprintf(fp, "%s %s %s\n",
+ fprintf(fp, "%s %s %s\n",
The overall problem I was trying to fix is that there isn't enough room
for a 16 character 64-bit hexadecimal address under the previous
header's START_PADDR when the maximum lengths are used for the other
fields. I inserted three spaces before and two spaces after START_PADDR
in the header and increased the strlen() contents the same way. But I
also did it for the format string even though mkstring() is inserting
them already based on the strlen() value. I've attached a patch that
contains version 3 of the complete handling of 64 bit expansion of
node_start_paddr and get this:
crash> kmem -n
[...]
MEM_MAP START_PADDR START_MAPNR
f4a02200 10000 16
[...]
MEM_MAP START_PADDR START_MAPNR
f2802000 100000000 1048576
--
David Mair
SUSE Linux
In dump_memory_nodes() node_start_paddr is too small for all NUMA x86
addresses. Those are 44 bits derived from a 32 bit read and PTOB() expands
the result neatly to 64 bits. Standard x86 pages require a 32 bit read of
the physical address so a 32 bit local, temp_node_start_paddr, was added to
retain the 32 bit read for the page-table field type and compiler expansion
to 64 bits into the common local in dump_memory_nodes(). Another use of
node_start_paddr was modified to assume 64 bits. I also modified the indent
of node_start_paddr to use tabs like the preceding lines. Using the same crash
dump Petr used in his test case I see that the vtop f2800080 output now
includes struct page data in the last table that does not appear before this
patch was added.
This patch is based on crash-6.0.2
Signed-off-by: David Mair <dmair@xxxxxxxx>
---
diff --git a/memory.c b/memory.c
index 95eefc9..681abb9 100755
--- a/memory.c
+++ b/memory.c
@@ -13192,7 +13192,8 @@ dump_memory_nodes(int initialize)
int i, j;
int n, id, node, flen, slen, badaddr;
ulong node_mem_map;
- ulong node_start_paddr;
+ ulong temp_node_start_paddr;
+ ulonglong node_start_paddr;
ulong node_start_pfn;
ulong node_start_mapnr;
ulong node_spanned_pages, node_present_pages;
@@ -13286,10 +13287,12 @@ dump_memory_nodes(int initialize)
badaddr = TRUE;
}
- if (VALID_MEMBER(pglist_data_node_start_paddr))
+ if (VALID_MEMBER(pglist_data_node_start_paddr)) {
readmem(pgdat+OFFSET(pglist_data_node_start_paddr),
- KVADDR, &node_start_paddr, sizeof(ulong),
+ KVADDR, &temp_node_start_paddr, sizeof(ulong),
"pglist node_start_paddr", FAULT_ON_ERROR);
+ node_start_paddr = temp_node_start_paddr;
+ }
else if (VALID_MEMBER(pglist_data_node_start_pfn)) {
readmem(pgdat+OFFSET(pglist_data_node_start_pfn),
KVADDR, &node_start_pfn, sizeof(ulong),
@@ -13394,14 +13397,14 @@ dump_memory_nodes(int initialize)
fprintf(fp, "%lx\n", node_zones);
}
- fprintf(fp, "%s START_PADDR START_MAPNR\n",
+ fprintf(fp, "%s START_PADDR START_MAPNR\n",
mkstring(buf1, VADDR_PRLEN, CENTER|LJUST,
"MEM_MAP"));
fprintf(fp, "%s %s %s\n",
mkstring(buf1, VADDR_PRLEN,
CENTER|LONG_HEX, MKSTR(node_mem_map)),
mkstring(buf2, strlen("START_PADDR"),
- CENTER|LONG_HEX|RJUST, MKSTR(node_start_paddr)),
+ CENTER|LONGLONG_HEX|RJUST, MKSTR(&node_start_paddr)),
mkstring(buf3, strlen("START_MAPNR"),
CENTER|LONG_DEC|RJUST,
MKSTR(node_start_mapnr)));
--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility