Boot failure on x86_64 (Oops in set_cpu_sibling_map())

Today's linux-next (next-20090730) failed to boot on an x86_64 box (IBM BladeCenter LS21) with the following trace:

ACPI: Core revision 20090625
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81328c7b>] set_cpu_sibling_map+0x24f/0x353
PGD 0
Oops: 0002 [#1] SMP
last sysfs file:
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.31-rc4-autotest-next-20090730-5-default #1 BladeCenter LS21 -[79716AA]-
RIP: 0010:[<ffffffff81328c7b>]  [<ffffffff81328c7b>] set_cpu_sibling_map+0x24f/0x353
RSP: 0018:ffff88012b319e20  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000cbdc
RDX: 000000000000cbf0 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88012b319e80 R08: 0000000000000004 R09: ffff880028092bc0
R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000000
R13: ffff8800280362c0 R14: ffff8800280362c0 R15: 00000000000142c0
FS:  0000000000000000(0000) GS:ffff880028022000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000001001000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff88012b318000, task ffff88012b316000)
Stack:
0000000000000003 000000000000cbe8 000000000000cbf8 000000000000cbf0
<0> 000000000000cbdc 00000000000142c0 0000000000000000 0000000000000003
<0> 0000000000000004 000000000000cbf8 000000000000cbe8 00000000000142c0
Call Trace:
[<ffffffff81664c96>] native_smp_prepare_cpus+0x146/0x3b6
[<ffffffff8165b594>] kernel_init+0x84/0x1db
[<ffffffff8100ca1a>] child_rip+0xa/0x20
[<ffffffff8165b510>] ? kernel_init+0x0/0x1db
[<ffffffff8100ca10>] ? child_rip+0x0/0x20
Code: 00 00 48 89 c2 49 23 85 b0 00 00 00 49 23 96 b0 00 00 00 48 39 c2 75 2b 49 63 c4 48 8b 55 b8 48 8b 04 c5 90 fe 63 81 48 8b 04 02 <f0> 0f ab 18 48 63 c3 48 8b 04 c5 90 fe 63 81 48 8b 04 02 f0 44
RIP  [<ffffffff81328c7b>] set_cpu_sibling_map+0x24f/0x353
RSP <ffff88012b319e20>
CR2: 0000000000000000
---[ end trace 4eaa2a86a8e2da22 ]---
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: swapper Tainted: G      D    2.6.31-rc4-autotest-next-20090730-5-default #1
Call Trace:
[<ffffffff8132bc59>] panic+0x75/0x120
[<ffffffff8104f41a>] ? exit_ptrace+0x33/0x12b
[<ffffffff810493c0>] do_exit+0x79/0x6c8
[<ffffffff8132f329>] oops_end+0xb3/0xbb
[<ffffffff8102934f>] no_context+0x1ed/0x1fc
[<ffffffff810294f0>] __bad_area_nosemaphore+0x192/0x1b8
[<ffffffff810ac967>] ? __alloc_pages_nodemask+0x118/0x57d
[<ffffffff81029524>] bad_area_nosemaphore+0xe/0x10
[<ffffffff8133077f>] do_page_fault+0x187/0x2c6
[<ffffffff8132e86f>] page_fault+0x1f/0x30
[<ffffffff81328c7b>] ? set_cpu_sibling_map+0x24f/0x353
[<ffffffff81664c96>] native_smp_prepare_cpus+0x146/0x3b6
[<ffffffff8165b594>] kernel_init+0x84/0x1db
[<ffffffff8100ca1a>] child_rip+0xa/0x20
[<ffffffff8165b510>] ? kernel_init+0x0/0x1db
[<ffffffff8100ca10>] ? child_rip+0x0/0x20

The failure points to the following piece of code in set_cpu_sibling_map():

if ((c->phys_proc_id == o->phys_proc_id) &&
    (c->cpu_node_id == o->cpu_node_id)) {
        cpumask_set_cpu(i, cpu_node_mask(cpu));   <==
        cpumask_set_cpu(cpu, cpu_node_mask(i));   <==
}


Yesterday's linux-next tree worked fine. I have attached the full boot log.

Thanks
-Sachin

--

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------

Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.31-rc4-autotest-next-20090730-5-default (root@mls21b) (gcc version 4.3.2 [gcc-4_3-branch revision 141291] (SUSE Linux) ) #1 SMP Thu Jul 30 14:47:18 IST 2009
Command line: root=/dev/sda1 console=tty0 console=ttyS1,19200 resume=/dev/disk/by-id/scsi-3500000e015c26a80-part2 splash=silent crashkernel=256M-:128M@16M IDENT=1248946314
KERNEL supported cpus:
  Intel GenuineIntel
  AMD AuthenticAMD
  Centaur CentaurHauls
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009c000 (usable)
 BIOS-e820: 000000000009c000 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000cffa3900 (usable)
 BIOS-e820: 00000000cffa3900 - 00000000cffa7400 (ACPI data)
 BIOS-e820: 00000000cffa7400 - 00000000d0000000 (reserved)
 BIOS-e820: 00000000f4000000 - 00000000fc000000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000130000000 (usable)
DMI 2.4 present.
last_pfn = 0x130000 max_arch_pfn = 0x400000000
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
last_pfn = 0xcffa3 max_arch_pfn = 0x400000000
init_memory_mapping: 0000000000000000-00000000cffa3000
init_memory_mapping: 0000000100000000-0000000130000000
RAMDISK: 3771f000 - 37fef6c7
ACPI: RSDP 00000000000fdfe0 00014 (v00 IBM   )
ACPI: RSDT 00000000cffa7380 00038 (v01 IBM    SERLEWIS 00001000 IBM  45444F43)
ACPI: FACP 00000000cffa72c0 00084 (v02 IBM    SERLEWIS 00001000 IBM  45444F43)
ACPI: DSDT 00000000cffa3900 036CE (v01 IBM    SERLEWIS 00001000 INTL 20060912)
ACPI: FACS 00000000cffa7040 00040
ACPI: APIC 00000000cffa7200 00090 (v01 IBM    SERLEWIS 00001000 IBM  45444F43)
ACPI: SRAT 00000000cffa7100 000E8 (v01 AMD    HAMMER   00000001 AMD  00000001)
ACPI: HPET 00000000cffa70c0 00038 (v01 IBM    SERLEWIS 00001000 IBM  45444F43)
ACPI: MCFG 00000000cffa7080 0003C (v01 IBM    SERLEWIS 00001000 IBM  45444F43)
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1
SRAT: Node 0 PXM 0 0-a0000
SRAT: Node 0 PXM 0 100000-d0000000
SRAT: Node 0 PXM 0 100000000-130000000
Bootmem setup node 0 0000000000000000-0000000130000000
  NODE_DATA [000000000000f640 - 000000000004363f]
  bootmap [0000000000044000 -  0000000000069fff] pages 26
(9 early reservations) ==> bootmem [0000000000 - 0130000000]
  #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
  #1 [0000006000 - 0000008000]       TRAMPOLINE ==> [0000006000 - 0000008000]
  #2 [0001000000 - 0005d186b4]    TEXT DATA BSS ==> [0001000000 - 0005d186b4]
  #3 [003771f000 - 0037fef6c7]          RAMDISK ==> [003771f000 - 0037fef6c7]
  #4 [000009c000 - 0000100000]    BIOS reserved ==> [000009c000 - 0000100000]
  #5 [0005d19000 - 0005d192d0]              BRK ==> [0005d19000 - 0005d192d0]
  #6 [0000008000 - 000000c000]          PGTABLE ==> [0000008000 - 000000c000]
  #7 [000000c000 - 000000d000]          PGTABLE ==> [000000c000 - 000000d000]
  #8 [000000d000 - 000000f640]       MEMNODEMAP ==> [000000d000 - 000000f640]
found SMP MP-table at [ffff88000009c140] 9c140
crashkernel reservation failed - memory is in use
Zone PFN ranges:
  DMA      0x00000000 -> 0x00001000
  DMA32    0x00001000 -> 0x00100000
  Normal   0x00100000 -> 0x00130000
Movable zone start PFN for each node
early_node_map[3] active PFN ranges
    0: 0x00000000 -> 0x0000009c
    0: 0x00000100 -> 0x000cffa3
    0: 0x00100000 -> 0x00130000
Detected use of extended apic ids on hypertransport bus
Detected use of extended apic ids on hypertransport bus
ACPI: PM-Timer IO Port: 0x488
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x02] enabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x0e] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 14, version 17, address 0xfec00000, GSI 0-15
ACPI: IOAPIC (id[0x0d] address[0xfec02000] gsi_base[16])
IOAPIC[1]: apic_id 13, version 17, address 0xfec02000, GSI 16-31
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
Using ACPI (MADT) for SMP configuration information
ACPI: HPET id: 0x1166a201 base: 0xfed00000
SMP: Allowing 4 CPUs, 0 hotplug CPUs
PM: Registered nosave memory: 000000000009c000 - 00000000000a0000
PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000
PM: Registered nosave memory: 00000000000e0000 - 0000000000100000
PM: Registered nosave memory: 00000000cffa3000 - 00000000cffa4000
PM: Registered nosave memory: 00000000cffa4000 - 00000000cffa7000
PM: Registered nosave memory: 00000000cffa7000 - 00000000cffa8000
PM: Registered nosave memory: 00000000cffa8000 - 00000000d0000000
PM: Registered nosave memory: 00000000d0000000 - 00000000f4000000
PM: Registered nosave memory: 00000000f4000000 - 00000000fc000000
PM: Registered nosave memory: 00000000fc000000 - 00000000fec00000
PM: Registered nosave memory: 00000000fec00000 - 0000000100000000
Allocating PCI resources starting at d0000000 (gap: d0000000:24000000)
NR_CPUS:4096 nr_cpumask_bits:4 nr_cpu_ids:4 nr_node_ids:2
PERCPU: Embedded 28 pages at ffff880028022000, static data 85408 bytes
Built 1 zonelists in Node order, mobility grouping on.  Total pages: 1031249
Policy zone: Normal
Kernel command line: root=/dev/sda1 console=tty0 console=ttyS1,19200 resume=/dev/disk/by-id/scsi-3500000e015c26a80-part2 splash=silent crashkernel=256M-:128M@16M IDENT=1248946314
PID hash table entries: 4096 (order: 12, 32768 bytes)
Initializing CPU#0
Checking aperture...
No AGP bridge found
Node 0: aperture @ f4000000 size 64 MB
Node 1: aperture @ f4000000 size 64 MB
Memory: 4045260k/4980736k available (3281k kernel code, 787204k absent, 148272k reserved, 3170k data, 1360k init)
start_kernel(): bug: interrupts were enabled *very* early, fixing it
Hierarchical RCU implementation.
NR_IRQS:4352
Fast TSC calibration using PIT
Detected 2199.723 MHz processor.
Console: colour VGA+ 80x25
console [tty0] enabled
console [ttyS1] enabled
allocated 41943040 bytes of page_cgroup
please try 'cgroup_disable=memory' option if you don't want memory cgroups
HPET: 3 timers in total, 0 timers will be used for per-cpu timer
Calibrating delay loop (skipped), value calculated using timer frequency.. 4399.44 BogoMIPS (lpj=8798892)
Security Framework initialized
SELinux:  Disabled at boot.
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Mount-cache hash table entries: 256
Initializing cgroup subsys ns
Initializing cgroup subsys cpuacct
Initializing cgroup subsys memory
Initializing cgroup subsys devices
Initializing cgroup subsys freezer
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 0/0x0 -> Node 0
CPU: Physical Processor ID: 0
CPU: Processor Node ID: 0
CPU: Processor Core ID: 0
mce: CPU supports 5 MCE banks
using C1E aware idle routine
Performance Counters: AMD PMU driver.
... version:                 0
... bit width:               48
... generic counters:        4
... value mask:              0000ffffffffffff
... max period:              00007fffffffffff
... fixed-purpose counters:  0
... counter mask:            000000000000000f
ACPI: Core revision 20090625
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81328c7b>] set_cpu_sibling_map+0x24f/0x353
PGD 0 
Oops: 0002 [#1] SMP 
last sysfs file: 
CPU 0 
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.31-rc4-autotest-next-20090730-5-default #1 BladeCenter LS21 -[79716AA]-
RIP: 0010:[<ffffffff81328c7b>]  [<ffffffff81328c7b>] set_cpu_sibling_map+0x24f/0x353
RSP: 0018:ffff88012b319e20  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000cbdc
RDX: 000000000000cbf0 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88012b319e80 R08: 0000000000000004 R09: ffff880028092bc0
R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000000
R13: ffff8800280362c0 R14: ffff8800280362c0 R15: 00000000000142c0
FS:  0000000000000000(0000) GS:ffff880028022000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000001001000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff88012b318000, task ffff88012b316000)
Stack:
 0000000000000003 000000000000cbe8 000000000000cbf8 000000000000cbf0
<0> 000000000000cbdc 00000000000142c0 0000000000000000 0000000000000003
<0> 0000000000000004 000000000000cbf8 000000000000cbe8 00000000000142c0
Call Trace:
 [<ffffffff81664c96>] native_smp_prepare_cpus+0x146/0x3b6
 [<ffffffff8165b594>] kernel_init+0x84/0x1db
 [<ffffffff8100ca1a>] child_rip+0xa/0x20
 [<ffffffff8165b510>] ? kernel_init+0x0/0x1db
 [<ffffffff8100ca10>] ? child_rip+0x0/0x20
Code: 00 00 48 89 c2 49 23 85 b0 00 00 00 49 23 96 b0 00 00 00 48 39 c2 75 2b 49 63 c4 48 8b 55 b8 48 8b 04 c5 90 fe 63 81 48 8b 04 02 <f0> 0f ab 18 48 63 c3 48 8b 04 c5 90 fe 63 81 48 8b 04 02 f0 44 
RIP  [<ffffffff81328c7b>] set_cpu_sibling_map+0x24f/0x353
 RSP <ffff88012b319e20>
CR2: 0000000000000000
---[ end trace 4eaa2a86a8e2da22 ]---
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: swapper Tainted: G      D    2.6.31-rc4-autotest-next-20090730-5-default #1
Call Trace:
 [<ffffffff8132bc59>] panic+0x75/0x120
 [<ffffffff8104f41a>] ? exit_ptrace+0x33/0x12b
 [<ffffffff810493c0>] do_exit+0x79/0x6c8
 [<ffffffff8132f329>] oops_end+0xb3/0xbb
 [<ffffffff8102934f>] no_context+0x1ed/0x1fc
 [<ffffffff810294f0>] __bad_area_nosemaphore+0x192/0x1b8
 [<ffffffff810ac967>] ? __alloc_pages_nodemask+0x118/0x57d
 [<ffffffff81029524>] bad_area_nosemaphore+0xe/0x10
 [<ffffffff8133077f>] do_page_fault+0x187/0x2c6
 [<ffffffff8132e86f>] page_fault+0x1f/0x30
 [<ffffffff81328c7b>] ? set_cpu_sibling_map+0x24f/0x353
 [<ffffffff81664c96>] native_smp_prepare_cpus+0x146/0x3b6
 [<ffffffff8165b594>] kernel_init+0x84/0x1db
 [<ffffffff8100ca1a>] child_rip+0xa/0x20
 [<ffffffff8165b510>] ? kernel_init+0x0/0x1db
 [<ffffffff8100ca10>] ? child_rip+0x0/0x20

