Hi!

Here it goes again. For the 2.5.6-pre1 report, please see:
http://marc.theaimsgroup.com/?l=linux-raid&m=101484845527913&w=2

[1.] One line summary of the problem:

Shutdown always leaves the software RAID array in an unclean state; the
next shutdown (during reconstruction) causes a kernel BUG.

[2.] Full description of the problem/report:

With 2.5.6 (and at least 2.5.6-pre1), the shutdown procedure leaves the
RAID array in an unclean state. The next time the machine goes down
(during reconstruction), there is a kernel BUG at raid1.c:656.
This is completely reproducible.

If you run shutdown after reconstruction has finished, there is no
problem. However, the RAID array is again left in an unclean state.

unmounting file systems
kernel BUG at raid1.c:656
invalid operand: 0000
CPU:    0
EIP:    0010:[<c01dd40b>]    Not tainted
EFLAGS: 0010293
eax: cfe73c18   ebx: cfe73800   ecx: 00000001   edx: 00000003
esi: cfe16000   edi: cde73800   ebp: cfe17fe0   esp: cfe17f84
ds: 0018   es: 0018   ss: 0018
Process raidsyncd (pid: 10, threadinfo=cfe16000 task=c135d6e0)
Stack: cfe73800 c136e500 c136e558 cfe17fe0 0000222a 0000176b 00001899 c01ddeba
       c01dded1 cfe73800 cfe16000 cfe1c160 cfe1c168 c01e2aea cfe73800 00000100
       c137dea4 cfe1c160 cfe28000 00000000 c135d6e0 00000000 00000000 00000000
Call Trace: [<c01ddeba>][<c01dded1>][<c01e2aea>][<c0105614>]
Code: 0f 0b 90 02 8f 93 26 c0 5b 5e 5f 5d 83 c4 10 c3 90 81 ec a0

[3.] Keywords:

kernel bug, raid (raid1), md

[4.] Kernel version (from /proc/version):

Linux version 2.5.6 (root@eowyn.sillanpaa.jyu.fi) (gcc version 2.95.3 20010315 (release)) #3 Sun Mar 10 14:20:04 EET 2002

[5.] Output of Oops.. message

ksymoops 0.7c on i586 2.5.6.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.5.6/ (default)
     -m /usr/src/linux/System.map (default)

Warning: You did not tell me where to find symbol information. I will assume that the log matches the kernel and modules that are running right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get more accurate output by telling me the kernel version and where to find map, modules, ksyms etc. ksymoops -h explains the options.

Warning (compare_maps): ksyms_base symbol idle_cpu_R__ver_idle_cpu not found in System.map. Ignoring ksyms_base entry
Warning (compare_maps): mismatch on symbol partition_name , ksyms_base says c01deaa0, System.map says c0151b00. Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol vmalloc_to_page_R__ver_vmalloc_to_page not found in System.map. Ignoring ksyms_base entry
invalid operand: 0000
CPU:    0
EIP:    0010:[<c01dd40b>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 0010293
eax: cfe73c18   ebx: cfe73800   ecx: 00000001   edx: 00000003
esi: cfe16000   edi: cde73800   ebp: cfe17fe0   esp: cfe17f84
ds: 0018   es: 0018   ss: 0018
Stack: cfe73800 c136e500 c136e558 cfe17fe0 0000222a 0000176b 00001899 c01ddeba
       c01dded1 cfe73800 cfe16000 cfe1c160 cfe1c168 c01e2aea cfe73800 00000100
       c137dea4 cfe1c160 cfe28000 00000000 c135d6e0 00000000 00000000 00000000
Call Trace: [<c01ddeba>][<c01dded1>][<c01e2aea>][<c0105614>]
Code: 0f 0b 90 02 8f 93 26 c0 5b 5e 5f 5d 83 c4 10 c3 90 81 ec a0

>>EIP; c01dd40b <close_sync+ff/110>   <=====
Trace; c01ddeba <raid1syncd+2e/58>
Trace; c01dded1 <raid1syncd+45/58>
Trace; c01e2aea <md_thread+15e/1c4>
Trace; c0105614 <kernel_thread+28/38>
Code;  c01dd40b <close_sync+ff/110>
00000000 <_EIP>:
Code;  c01dd40b <close_sync+ff/110>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c01dd40d <close_sync+101/110>
   2:   90                        nop
Code;  c01dd40e <close_sync+102/110>
   3:   02 8f 93 26 c0 5b         add    0x5bc02693(%edi),%cl
Code;  c01dd414 <close_sync+108/110>
   9:   5e                        pop    %esi
Code;  c01dd415 <close_sync+109/110>
   a:   5f                        pop    %edi
Code;  c01dd416 <close_sync+10a/110>
   b:   5d                        pop    %ebp
Code;  c01dd417 <close_sync+10b/110>
   c:   83 c4 10                  add    $0x10,%esp
Code;  c01dd41a <close_sync+10e/110>
   f:   c3                        ret
Code;  c01dd41b <close_sync+10f/110>
  10:   90                        nop
Code;  c01dd41c <diskop+0/518>
  11:   81 ec a0 00 00 00         sub    $0xa0,%esp

4 warnings issued.  Results may not be reliable.

[6.] A small shell script or example program which triggers
     the problem (if possible)

power on -> shutdown -> restart -> shutdown -> kernel bug

[7.2.] Processor information (from /proc/cpuinfo):

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 5
model           : 4
model name      : Pentium MMX
stepping        : 3
cpu MHz         : 199.907
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : yes
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr mce cx8 mmx
bogomips        : 398.95

[7.3.] Module information (from /proc/modules):

snd-ens1371            10720   0 (autoclean)
snd-pcm                48832   0 (autoclean) [snd-ens1371]
snd-timer              10128   0 (autoclean) [snd-pcm]
snd-rawmidi            12496   0 (autoclean) [snd-ens1371]
snd-ac97-codec         22416   0 (autoclean) [snd-ens1371]
snd                    28048   0 (autoclean) [snd-ens1371 snd-pcm snd-timer snd-rawmidi snd-ac97-codec]
soundcore               3984   0 (autoclean) [snd]
3c509                   7024   1 (autoclean)
3c59x                  25216   1 (autoclean)
ipt_MASQUERADE          1552   1
ipt_state                864  18
ipt_LOG                 3600  23
iptable_nat            16208   1 [ipt_MASQUERADE]
ip_conntrack           16080   2 [ipt_MASQUERADE ipt_state iptable_nat]
iptable_filter          2016   1
iptable_mangle          2432   1
ip_tables              11104   8 [ipt_MASQUERADE ipt_state ipt_LOG iptable_nat iptable_filter iptable_mangle]

[7.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem):

0000-001f : dma1
0020-003f : pic1
0040-005f : timer
0060-006f : keyboard
0080-008f : dma page reg
00a0-00bf : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
01f0-01f7 : ide0
02f8-02ff : serial(auto)
0300-030f : 3c509
0376-0376 : ide1
03c0-03df : vga+
  03c0-03df : matrox
03f6-03f6 : ide0
03f8-03ff : serial(auto)
0cf8-0cff : PCI conf1
d000-d03f : Ensoniq 5880 AudioPCI
  d000-d03f : Ensoniq AudioPCI
d400-d43f : 3Com Corporation 3c900 Combo [Boomerang]
  d400-d43f : 00:09.0
d800-d81f : Intel Corp. 82371AB PIIX4 USB
e000-e00f : Intel Corp. 82371AB PIIX4 IDE
  e000-e007 : ide0
  e008-e00f : ide1
e400-e43f : Intel Corp. 82371AB PIIX4 ACPI
e800-e81f : Intel Corp. 82371AB PIIX4 ACPI

00000000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000c7fff : Video ROM
000f0000-000fffff : System ROM
00100000-0fffffff : System RAM
  00100000-002301dc : Kernel code
  002301dd-002a05f3 : Kernel data
e5800000-e58fffff : Sigma Designs, Inc. REALmagic Hollywood Plus DVD Decoder
e6000000-e6003fff : Matrox Graphics, Inc. MGA 2064W [Millennium]
  e6000000-e6003fff : matroxfb MMIO
e7000000-e77fffff : Matrox Graphics, Inc. MGA 2064W [Millennium]
  e7000000-e77fffff : matroxfb FB
ffff0000-ffffffff : reserved

[7.5.] PCI information ('lspci -vvv' as root):

00:00.0 Host bridge: Intel Corporation 430TX - 82439TX MTXC (rev 01)
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
        Latency: 32 set

00:01.0 ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 01)
        Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0 set

00:01.1 IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01) (prog-if 80 [Master])
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 set
        Region 4: I/O ports at e000 [size=16]

00:01.2 USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 01) (prog-if 00 [UHCI])
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 set
        Interrupt: pin D routed to IRQ 9
        Region 4: I/O ports at d800 [size=32]

00:01.3 Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 01)
        Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Interrupt: pin ? routed to IRQ 9

00:09.0 Ethernet controller: 3Com Corporation 3c900 Combo [Boomerang]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 3 min, 8 max, 32 set
        Interrupt: pin A routed to IRQ 9
        Region 0: I/O ports at d400 [size=64]
        Expansion ROM at <unassigned> [disabled] [size=64K]

00:0a.0 VGA compatible controller: Matrox Graphics, Inc. MGA 2064W [Millennium] (rev 01) (prog-if 00 [VGA])
        Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Interrupt: pin A routed to IRQ 9
        Region 0: Memory at e6000000 (32-bit, non-prefetchable) [size=16K]
        Region 1: Memory at e7000000 (32-bit, prefetchable) [size=8M]
        Expansion ROM at <unassigned> [disabled] [size=64K]

00:0b.0 Multimedia controller: Sigma Designs, Inc. REALmagic Hollywood Plus DVD Decoder (rev 02)
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 set
        Interrupt: pin A routed to IRQ 10
        Region 0: Memory at e5800000 (32-bit, non-prefetchable) [size=1M]
        Capabilities: [40] Power Management version 1
                Flags: PMEClk- AuxPwr- DSI- D1- D2- PME-
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:0c.0 Multimedia audio controller: Ensoniq: Unknown device 5880 (rev 02)
        Subsystem: Ensoniq: Unknown device 2000
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=slow >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 12 min, 128 max, 32 set
        Interrupt: pin A routed to IRQ 11
        Region 0: I/O ports at d000 [size=64]
        Capabilities: [dc] Power Management version 1
                Flags: PMEClk- AuxPwr- DSI+ D1- D2+ PME-
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

[7.6.] SCSI information (from /proc/scsi/scsi):

Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: LITE-ON  Model: LTR-16102B  Rev: OS09
  Type:   CD-ROM   ANSI SCSI revision: 02

[7.7] mdstat just after second shutdown (this will cause kernel bug)

Personalities : [raid1]
md0 : active raid1 hdc1[1] hda13[0]
      1228864 blocks [2/2] [UU]
      [==>..................]  resync = 14.2% (175424/1228864) finish=6.1min speed=2843K/sec
unused devices: <none>

[X.]

If you need further information, or need a tester for a patch, please
don't hesitate to ask!

BR, Jani

--
Jani Averbach
jaa@iki.fi
+358 40 759 0984
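
For anyone trying to reproduce this: the BUG only shows up when the
shutdown happens while the resync is still running, so it may help to
check /proc/mdstat first. Below is a minimal sketch in Python, assuming
the mdstat layout shown in [7.7]; the function names are made up for
illustration and this is not part of the original test procedure.

```python
import re


def resync_status(mdstat_text):
    """Return {array_name: percent_done} for arrays that are resyncing.

    Assumes the /proc/mdstat layout shown in [7.7]: an "mdN : ..." line
    followed by indented status lines, one of which may contain
    "resync = NN.N%".
    """
    status = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"(md\d+)\s*:", line)
        if header:
            current = header.group(1)
        progress = re.search(r"resync\s*=\s*([\d.]+)%", line)
        if progress and current is not None:
            status[current] = float(progress.group(1))
    return status


def read_mdstat(path="/proc/mdstat"):
    """Read the live mdstat file (hypothetical helper, Linux only)."""
    with open(path) as f:
        return f.read()
```

Calling resync_status(read_mdstat()) on the affected machine should
give a non-empty result while reconstruction is running; shutting down
at that point is the case that hits the BUG at raid1.c:656 here.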