[Hotplug_sig] Bug in CPU Hotplug on x86

On Fri, Feb 17, 2006 at 01:50:01PM -0800, Raj, Ashok wrote:
> On which kernel version does this happen? Or does it not matter?

This particular test was run on 2.6.16-rc3-mm1; however, I have seen
this issue on other 2.6.16-rc* kernels as well.

Bryce

> >-----Original Message-----
> >From: hotplug_sig-bounces@xxxxxxxxxxxxxx [mailto:hotplug_sig-bounces@xxxxxxxxxxxxxx] On Behalf Of Bryce Harrington
> >Sent: Friday, February 17, 2006 1:39 PM
> >To: hotplug_sig@xxxxxxxxxxxxxx
> >Subject: [Hotplug_sig] Bug in CPU Hotplug on x86
> >
> >Hi Martine and Mary,
> >
> >I've gotten the hotplug cpu test working and automated now on an x86
> >box.  :-)
> >
> >I've also already found a bug in CPU hotplug on this platform.  It
> >looks like a legitimate issue that the Hotplug SIG can report to the
> >developers, but I wanted to run it by folks on this list first.  Can
> >you review this and let me know if it should be reported?  And who
> >would be the best person to show this bug report to?
> >
> >
> >This fault occurs on the first hotplug test.  This test attempts to
> >offline and then online each of the CPUs.  It fails when onlining
> >the CPU that it just offlined; this results in a system lockup
> >(requiring a power cycle).
> >
> >
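For readers unfamiliar with the mechanism, the offline/online cycle
described above can be sketched as follows. This is only a minimal
sketch, not the actual hotplug01.sh: it assumes the test drives the
standard sysfs control file /sys/devices/system/cpu/cpuN/online, and
the function names and the CPU_SYSFS override are hypothetical.

```shell
#!/bin/sh
# Sketch only: cycle one CPU offline and back online through sysfs.
# CPU_SYSFS is a hypothetical override so the functions can be pointed
# at a scratch directory instead of the real sysfs tree.
CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"

# set_cpu_state CPU STATE  (STATE is 1 for online, 0 for offline)
set_cpu_state() {
    ctl="$CPU_SYSFS/cpu$1/online"
    # Refuse CPUs that expose no writable control file
    [ -w "$ctl" ] || { echo "cpu$1: no writable online file" >&2; return 1; }
    echo "$2" > "$ctl" || return 1
    # Read the state back to confirm the change took effect
    [ "$(cat "$ctl")" = "$2" ]
}

# cycle_cpu CPU: offline the CPU, then bring it back online
cycle_cpu() {
    printf 'offlining cpu%s:  ' "$1"
    set_cpu_state "$1" 0 && echo OK || { echo FAIL; return 1; }
    printf 'onlining cpu%s:  ' "$1"
    set_cpu_state "$1" 1 && echo OK || { echo FAIL; return 1; }
}
```

Running "cycle_cpu 1" as root would attempt the same offline/online
step for cpu1. On a system affected by this bug, the second write
would presumably never return.
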
> >Here is the output from hotplug01.sh:
> >
> >  Name:   hotplug01
> >  Date:   Wed Feb 15 12:19:16 PST 2006
> >  Desc:   What happens to disk controller interrupts when offlining CPUs?
> >
> >  CPU is 0
> >  Starting loop '1'
> >  offlining cpu1:  OK
> >  offlining cpu0:  OK
> >  onlining cpu1:  OK
> >
> >At this point the system locks up.
> >
> >
> >During the test run, I'm seeing the following output from dmesg:
> >
> > Breaking affinity for irq 0
> > CPU 1 is now offline
> > Booting processor 1/0 eip 2000
> > CPU 1 irqstacks, hard=c04e3000 soft=c04c3000
> > Initializing CPU#1
> > Calibrating delay using timer specific routine.. 1733.57 BogoMIPS (lpj=3467154)
> > CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000 00000000 00000000 00000000
> > CPU: After vendor identify, caps: 0383fbff 00000000 00000000 00000000 00000000 00000000 00000000
> > CPU: L1 I cache: 16K, L1 D cache: 16K
> > CPU: L2 cache: 256K
> > CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040 00000000 00000000 00000000
> > Intel machine check architecture supported.
> > Intel machine check reporting enabled on CPU#1.
> > CPU1: Intel Pentium III (Coppermine) stepping 06
> > APIC error on CPU1: 00(40)
> >
> >
> >This is what /var/log/messages shows:
> >
> >Feb 15 12:19:17 cl009 Breaking affinity for irq 0
> >Feb 15 12:19:17 cl009 CPU 1 is now offline
> >Feb 15 12:19:19 cl009 Booting processor 1/0 eip 2000
> >Feb 15 12:19:19 cl009 CPU 1 irqstacks, hard=c04e3000 soft=c04c3000
> >Feb 15 12:19:19 cl009 Initializing CPU#1
> >Feb 15 12:19:19 cl009 Calibrating delay using timer specific routine.. 1733.57 BogoMIPS (lpj=3467154)
> >Feb 15 12:19:19 cl009 CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000 00000000 00000000 00000000
> >Feb 15 12:19:19 cl009 CPU: After vendor identify, caps: 0383fbff 00000000 00000000 00000000 00000000 00000000 00000000
> >Feb 15 12:19:19 cl009 CPU: L1 I cache: 16K, L1 D cache: 16K
> >Feb 15 12:19:19 cl009 CPU: L2 cache: 256K
> >Feb 15 12:19:19 cl009 CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040 00000000 00000000 00000000
> >Feb 15 12:19:19 cl009 Intel machine check architecture supported.
> >Feb 15 12:19:19 cl009 Intel machine check reporting enabled on CPU#1.
> >Feb 15 12:19:19 cl009 CPU1: Intel Pentium III (Coppermine) stepping 06
> >Feb 15 12:19:19 cl009 APIC error on CPU1: 00(40)
> >
> >
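The last log line is the only visible sign of trouble before the
lockup. As a reading aid, here is a minimal sketch that names the bits
set in a local APIC Error Status Register (ESR) value. It assumes the
two hex numbers in "APIC error on CPU1: 00(40)" are ESR readings taken
before and after the error handler rewrites the register, and it uses
the bit names from Intel's documentation for this processor
generation. The function name decode_apic_esr is hypothetical.

```shell
#!/bin/sh
# Sketch only: name the bits set in a local APIC ESR value, given as
# hex without the 0x prefix (for example "40").
decode_apic_esr() {
    val=$((0x$1))
    bit=0
    for name in \
        "send checksum error" \
        "receive checksum error" \
        "send accept error" \
        "receive accept error" \
        "reserved" \
        "send illegal vector" \
        "received illegal vector" \
        "illegal register address"
    do
        # Print the name when the corresponding bit is set
        [ $(( (val >> bit) & 1 )) -eq 1 ] && echo "bit $bit: $name"
        bit=$((bit + 1))
    done
    return 0
}
```

Under those assumptions, "decode_apic_esr 40" prints
"bit 6: received illegal vector", which would suggest that CPU1 was
sent an interrupt with an invalid vector while coming back online.
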
> >I've been able to reproduce this error 3 out of 3 times on this
> >particular system.  It is a Pentium III with the following
> >/proc/cpuinfo:
> >
> > processor       : 0
> > vendor_id       : GenuineIntel
> > cpu family      : 6
> > model           : 8
> > model name      : Pentium III (Coppermine)
> > stepping        : 6
> > cpu MHz         : 866.932
> > cache size      : 256 KB
> > fdiv_bug        : no
> > hlt_bug         : no
> > f00f_bug        : no
> > coma_bug        : no
> > fpu             : yes
> > fpu_exception   : yes
> > cpuid level     : 2
> > wp              : yes
> > flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
> > bogomips        : 1736.35
> >
> > processor       : 1
> > vendor_id       : GenuineIntel
> > cpu family      : 6
> > model           : 8
> > model name      : Pentium III (Coppermine)
> > stepping        : 6
> > cpu MHz         : 866.932
> > cache size      : 256 KB
> > fdiv_bug        : no
> > hlt_bug         : no
> > f00f_bug        : no
> > coma_bug        : no
> > fpu             : yes
> > fpu_exception   : yes
> > cpuid level     : 2
> > wp              : yes
> > flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
> > bogomips        : 1733.57
> >
> >
> >Bryce
