I have a dump from a 2.6.31-based x86_64 system where the number of "possible" cpus equals the system's NR_CPUS (32). On that system, the __per_cpu_offset table in the kernel consists of 32 valid offset pointers.

When crash loads this table into its __per_cpu_offset[NR_CPUS=4096] array in struct kernel_table, it knows the length of the kernel's array (32*sizeof(long)) and copies the 32 pointers, leaving the rest of its (much longer) array full of 0x0s. This happens in kernel.c:

	if (symbol_exists("__per_cpu_offset")) {
		if (LKCD_KERNTYPES())
			i = get_cpus_possible();
		else
			i = get_array_length("__per_cpu_offset", NULL, 0);
		get_symbol_data("__per_cpu_offset",
			sizeof(long)*((i && (i <= NR_CPUS)) ? i : NR_CPUS),
			&kt->__per_cpu_offset[0]);
		kt->flags |= PER_CPU_OFF;
	}

Later, in a couple of places, crash looks for the last valid __per_cpu_offset entry by reading the cpu_number value out of each per_cpu area and comparing it to the expected number, stopping at the first mismatch. (Remember that NR_CPUS in crash is much larger than the kernel's NR_CPUS, and that's OK.)

From x86_64.c:

	for (i = cpus = 0; i < NR_CPUS; i++) {
		readmem(symbol_value("per_cpu__cpu_number") +
			kt->__per_cpu_offset[i], KVADDR,
			&cpunumber, sizeof(int),
			"cpu number (per_cpu)", FAULT_ON_ERROR);
		if (cpunumber != cpus)
			break;
		cpus++;
	}

This works well when the kernel's array has fewer real per_cpu_offsets than the kernel's NR_CPUS, since the kernel preloads its array with a pointer (BOOT_PERCPU_OFFSET). When this loop runs past the real per_cpu_offset pointers and tries to use the BOOT_PERCPU_OFFSET, it reads a bogus value for cpunumber and terminates.

But when the kernel's table is full of valid per_cpu_offset pointers, this loop continues off the end of the copied entries into the part of crash's __per_cpu_offset array that still holds the 0x0 initial values, and dies with:

	crash: invalid kernel virtual address: cc08  type: "cpu number (per_cpu)"

The cc08 comes from the symbol_value of per_cpu__cpu_number:

	000000000000cc08 D per_cpu__cpu_number

Bottom line: Crash relies on an insufficient termination condition for the kernel's __per_cpu_offset array (an entry that points to an invalid cpu_number).

The included patch adds an additional loop termination so that crash doesn't run off the end of what it loaded from the dump. It just checks for a 0x0 (NULL) value in kt->__per_cpu_offset[i].

Bob Montgomery
Working at HP

--- x86_64.c.orig	2009-11-10 10:43:54.000000000 -0700
+++ x86_64.c	2009-11-10 10:41:23.000000000 -0700
@@ -791,6 +791,8 @@ x86_64_per_cpu_init(void)
 	ms = machdep->machspec;
 
 	for (i = cpus = 0; i < NR_CPUS; i++) {
+		if (kt->__per_cpu_offset[i] == NULL)
+			break;
 		readmem(symbol_value("per_cpu__cpu_number") +
 			kt->__per_cpu_offset[i],
 			KVADDR, &cpunumber, sizeof(int),
@@ -4199,6 +4201,8 @@ x86_64_get_smp_cpus(void)
 		return 1;
 
 	for (i = cpus = 0; i < NR_CPUS; i++) {
+		if (kt->__per_cpu_offset[i] == NULL)
+			break;
 		readmem(symbol_value("per_cpu__cpu_number") +
 			kt->__per_cpu_offset[i], KVADDR,
 			&cpunumber, sizeof(int),