Hi Bryce,

I'm confused. You have a 2-CPU system and you're trying to offline cpu0 after offlining cpu1. On a 2-CPU system this should NOT be allowed, since it would leave no CPU online :-) So why does the test reply "OK" to offlining cpu0?

Martine

-----Original Message-----
From: hotplug_sig-bounces@xxxxxxxxxxxxxx [mailto:hotplug_sig-bounces@xxxxxxxxxxxxxx] On Behalf Of Bryce Harrington
Sent: Friday, February 17, 2006 4:39 PM
To: hotplug_sig@xxxxxxxxxxxxxx
Subject: [Hotplug_sig] Bug in CPU Hotplug on x86

Hi Martine and Mary,

I've gotten the hotplug cpu test working and automated now on an x86 box. :-)

I've also already found a bug in CPU hotplug on this platform. It looks like a legitimate issue that Hotplug SIG can report to the developers, but I wanted to run it by folks on this list first. Can you review this and let me know if it should be reported? And who would be the best person to show this bug report to?

This fault occurs on the first hotplug test. This test attempts to offline and then online each of the CPUs. It is failing when onlining the CPU that it just offlined; this results in a system lockup (requiring power cycling).

Here is the output from hotplug01.sh:

Name: hotplug01
Date: Wed Feb 15 12:19:16 PST 2006
Desc: What happens to disk controller interrupts when offlining CPUs?

CPU is 0
Starting loop '1'
offlining cpu1: OK
offlining cpu0: OK
onlining cpu1: OK

At this point the system locks up.

During the test run, I'm seeing the following output from dmesg:

Breaking affinity for irq 0
CPU 1 is now offline
Booting processor 1/0 eip 2000
CPU 1 irqstacks, hard=c04e3000 soft=c04c3000
Initializing CPU#1
Calibrating delay using timer specific routine.. 1733.57 BogoMIPS (lpj=3467154)
CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000 00000000 00000000 00000000
CPU: After vendor identify, caps: 0383fbff 00000000 00000000 00000000 00000000 00000000 00000000
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040 00000000 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel Pentium III (Coppermine) stepping 06
APIC error on CPU1: 00(40)

This is what /var/log/messages shows:

Feb 15 12:19:17 cl009 Breaking affinity for irq 0
Feb 15 12:19:17 cl009 CPU 1 is now offline
Feb 15 12:19:19 cl009 Booting processor 1/0 eip 2000
Feb 15 12:19:19 cl009 CPU 1 irqstacks, hard=c04e3000 soft=c04c3000
Feb 15 12:19:19 cl009 Initializing CPU#1
Feb 15 12:19:19 cl009 Calibrating delay using timer specific routine.. 1733.57 BogoMIPS (lpj=3467154)
Feb 15 12:19:19 cl009 CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000 00000000 00000000 00000000
Feb 15 12:19:19 cl009 CPU: After vendor identify, caps: 0383fbff 00000000 00000000 00000000 00000000 00000000 00000000
Feb 15 12:19:19 cl009 CPU: L1 I cache: 16K, L1 D cache: 16K
Feb 15 12:19:19 cl009 CPU: L2 cache: 256K
Feb 15 12:19:19 cl009 CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040 00000000 00000000 00000000
Feb 15 12:19:19 cl009 Intel machine check architecture supported.
Feb 15 12:19:19 cl009 Intel machine check reporting enabled on CPU#1.
Feb 15 12:19:19 cl009 CPU1: Intel Pentium III (Coppermine) stepping 06
Feb 15 12:19:19 cl009 APIC error on CPU1: 00(40)

I've been able to reproduce this error 3 out of 3 times on this particular system.
It is a Pentium III with the following /proc/cpuinfo:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 8
model name      : Pentium III (Coppermine)
stepping        : 6
cpu MHz         : 866.932
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 1736.35

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 8
model name      : Pentium III (Coppermine)
stepping        : 6
cpu MHz         : 866.932
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 1733.57

Bryce
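
To make the question about offlining the last CPU concrete, here is a minimal Python sketch of an offline/online cycle driven through the sysfs `online` files (`/sys/devices/system/cpu/cpuN/online`). It is not the logic of hotplug01.sh. The function names (`online_path`, `is_online`, `set_online`, `cycle_cpus`) and the `sysfs_root` parameter are hypothetical and exist only for illustration. The guard that skips the last online CPU is an assumption about what a test could check, not something the script is known to do.

```python
import os


def online_path(cpu, sysfs_root="/sys"):
    """Path of the sysfs control file for one CPU (hypothetical helper)."""
    return os.path.join(sysfs_root, "devices", "system", "cpu",
                        f"cpu{cpu}", "online")


def is_online(cpu, sysfs_root="/sys"):
    """Return True if the CPU's 'online' file reads 1.

    Assumption: a CPU with no 'online' file is not hotpluggable and is
    treated as always online.
    """
    path = online_path(cpu, sysfs_root)
    if not os.path.exists(path):
        return True
    with open(path) as f:
        return f.read().strip() == "1"


def set_online(cpu, online, sysfs_root="/sys"):
    """Write 1 or 0 to the CPU's 'online' file."""
    with open(online_path(cpu, sysfs_root), "w") as f:
        f.write("1" if online else "0")


def cycle_cpus(cpus, sysfs_root="/sys"):
    """Offline, then online, each CPU in turn.

    A CPU is skipped if it has no 'online' file, or if taking it down
    would leave none of the listed CPUs online.
    """
    results = []
    for cpu in cpus:
        if not os.path.exists(online_path(cpu, sysfs_root)):
            results.append((cpu, "skipped: no online file"))
            continue
        others = [c for c in cpus
                  if c != cpu and is_online(c, sysfs_root)]
        if not others:
            results.append((cpu, "skipped: last online CPU"))
            continue
        set_online(cpu, False, sysfs_root)
        set_online(cpu, True, sysfs_root)
        results.append((cpu, "OK"))
    return results
```

Pointing `sysfs_root` at a scratch directory that mimics the sysfs layout allows the sequence to be dry-run without touching real CPUs; against the real `/sys` it needs root privileges.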