Linux 2.5.6 (and -pre1) leaves RAID array in unclean mode -> kernel BUG at raid1.c:656

Hi! Here it goes again. For the 2.5.6-pre1 report, please see:
http://marc.theaimsgroup.com/?l=linux-raid&m=101484845527913&w=2

[1.] One line summary of the problem:

Shutdown always leaves the software RAID array in an unclean state; the next
shutdown (during reconstruction) triggers a kernel BUG.

[2.] Full description of the problem/report:

With 2.5.6 (and at least 2.5.6-pre1) the shutdown procedure leaves the RAID
array in an unclean state. The next time the machine goes down (while the
array is reconstructing), the kernel hits a BUG at raid1.c:656. This is
completely reproducible.

If you run shutdown after the reconstruction has finished, there is no
kernel BUG. However, the RAID array is again left in an unclean state.

unmounting file systems kernel BUG at raid1.c:656
invalid operand: 0000
CPU: 0
EIP: 0010:[<c01dd40b>] Not tainted
EFLAGS: 0010293
eax: cfe73c18 ebx: cfe73800 ecx: 00000001 edx: 00000003
esi: cfe16000 edi: cde73800 ebp: cfe17fe0 esp: cfe17f84
ds: 0018 es: 0018 ss: 0018
Process raidsyncd (pid: 10, threadinfo=cfe16000 task=c135d6e0)
Stack:
cfe73800 c136e500 c136e558 cfe17fe0 0000222a 0000176b 00001899 c01ddeba
c01dded1 cfe73800 cfe16000 cfe1c160 cfe1c168 c01e2aea cfe73800 00000100
c137dea4 cfe1c160 cfe28000 00000000 c135d6e0 00000000 00000000 00000000

Call Trace: [<c01ddeba>][<c01dded1>][<c01e2aea>][<c0105614>]

Code: 0f 0b 90 02 8f 93 26 c0 5b 5e 5f 5d 83 c4 10 c3 90 81 ec a0


[3.] Keywords: kernel bug, raid (raid1), md

[4.] Kernel version (from /proc/version):

Linux version 2.5.6 (root@eowyn.sillanpaa.jyu.fi) (gcc version 2.95.3
20010315 (release)) #3 Sun Mar 10 14:20:04 EET 2002

[5.] Output of Oops.. message

ksymoops 0.7c on i586 2.5.6.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.5.6/ (default)
     -m /usr/src/linux/System.map (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Warning (compare_maps): ksyms_base symbol idle_cpu_R__ver_idle_cpu not found in System.map.  Ignoring ksyms_base entry
Warning (compare_maps): mismatch on symbol partition_name  , ksyms_base says c01deaa0, System.map says c0151b00.  Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol vmalloc_to_page_R__ver_vmalloc_to_page not found in System.map.  Ignoring ksyms_base entry
invalid operand: 0000
CPU: 0
EIP: 0010:[<c01dd40b>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 0010293
eax: cfe73c18 ebx: cfe73800 ecx: 00000001 edx: 00000003
esi: cfe16000 edi: cde73800 ebp: cfe17fe0 esp: cfe17f84
ds: 0018 es: 0018 ss: 0018
Stack:
cfe73800 c136e500 c136e558 cfe17fe0 0000222a 0000176b 00001899 c01ddeba
c01dded1 cfe73800 cfe16000 cfe1c160 cfe1c168 c01e2aea cfe73800 00000100
c137dea4 cfe1c160 cfe28000 00000000 c135d6e0 00000000 00000000 00000000
Call Trace: [<c01ddeba>][<c01dded1>][<c01e2aea>][<c0105614>]
Code: 0f 0b 90 02 8f 93 26 c0 5b 5e 5f 5d 83 c4 10 c3 90 81 ec a0

>>EIP; c01dd40b <close_sync+ff/110>   <=====
Trace; c01ddeba <raid1syncd+2e/58>
Trace; c01dded1 <raid1syncd+45/58>
Trace; c01e2aea <md_thread+15e/1c4>
Trace; c0105614 <kernel_thread+28/38>
Code;  c01dd40b <close_sync+ff/110>
00000000 <_EIP>:
Code;  c01dd40b <close_sync+ff/110>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c01dd40d <close_sync+101/110>
   2:   90                        nop
Code;  c01dd40e <close_sync+102/110>
   3:   02 8f 93 26 c0 5b         add    0x5bc02693(%edi),%cl
Code;  c01dd414 <close_sync+108/110>
   9:   5e                        pop    %esi
Code;  c01dd415 <close_sync+109/110>
   a:   5f                        pop    %edi
Code;  c01dd416 <close_sync+10a/110>
   b:   5d                        pop    %ebp
Code;  c01dd417 <close_sync+10b/110>
   c:   83 c4 10                  add    $0x10,%esp
Code;  c01dd41a <close_sync+10e/110>
   f:   c3                        ret
Code;  c01dd41b <close_sync+10f/110>
  10:   90                        nop
Code;  c01dd41c <diskop+0/518>
  11:   81 ec a0 00 00 00         sub    $0xa0,%esp


4 warnings issued.  Results may not be reliable.
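
The symbol annotations in the decoded trace have the form
name+offset/size, with both numbers in hex. A minimal Python sketch of that
arithmetic, using the EIP line above (the helper name parse_symbol is
hypothetical, not part of ksymoops):

```python
def parse_symbol(entry):
    """Split 'c01dd40b <close_sync+ff/110>' into (address, name, offset, size)."""
    addr_text, sym = entry.split(" <")
    name, rest = sym.rstrip(">").split("+")
    offset_text, size_text = rest.split("/")
    # All three numbers are hexadecimal in ksymoops output.
    return int(addr_text, 16), name, int(offset_text, 16), int(size_text, 16)


addr, name, offset, size = parse_symbol("c01dd40b <close_sync+ff/110>")
start = addr - offset   # first byte of the function
end = start + size      # first byte after the function
print(f"{name}: {start:08x}..{end:08x}")  # close_sync: c01dd30c..c01dd41c
```

The computed end address, c01dd41c, matches the diskop+0 line in the
disassembly, so the faulting ud2a sits 0x11 bytes before the end of
close_sync.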

[6.] A small shell script or example program which triggers the
     problem (if possible)

power on -> shutdown -> restart -> shutdown (while md0 is still resyncing) -> kernel BUG


[7.2.] Processor information (from /proc/cpuinfo):

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 5
model		: 4
model name	: Pentium MMX
stepping	: 3
cpu MHz		: 199.907
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: yes
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr mce cx8 mmx
bogomips	: 398.95


[7.3.] Module information (from /proc/modules):

snd-ens1371            10720   0 (autoclean)
snd-pcm                48832   0 (autoclean) [snd-ens1371]
snd-timer              10128   0 (autoclean) [snd-pcm]
snd-rawmidi            12496   0 (autoclean) [snd-ens1371]
snd-ac97-codec         22416   0 (autoclean) [snd-ens1371]
snd                    28048   0 (autoclean) [snd-ens1371 snd-pcm snd-timer snd-rawmidi snd-ac97-codec]
soundcore               3984   0 (autoclean) [snd]
3c509                   7024   1 (autoclean)
3c59x                  25216   1 (autoclean)
ipt_MASQUERADE          1552   1
ipt_state                864  18
ipt_LOG                 3600  23
iptable_nat            16208   1 [ipt_MASQUERADE]
ip_conntrack           16080   2 [ipt_MASQUERADE ipt_state iptable_nat]
iptable_filter          2016   1
iptable_mangle          2432   1
ip_tables              11104   8 [ipt_MASQUERADE ipt_state ipt_LOG iptable_nat iptable_filter iptable_mangle]

[7.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem):

0000-001f : dma1
0020-003f : pic1
0040-005f : timer
0060-006f : keyboard
0080-008f : dma page reg
00a0-00bf : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
01f0-01f7 : ide0
02f8-02ff : serial(auto)
0300-030f : 3c509
0376-0376 : ide1
03c0-03df : vga+
  03c0-03df : matrox
03f6-03f6 : ide0
03f8-03ff : serial(auto)
0cf8-0cff : PCI conf1
d000-d03f : Ensoniq 5880 AudioPCI
  d000-d03f : Ensoniq AudioPCI
d400-d43f : 3Com Corporation 3c900 Combo [Boomerang]
  d400-d43f : 00:09.0
d800-d81f : Intel Corp. 82371AB PIIX4 USB
e000-e00f : Intel Corp. 82371AB PIIX4 IDE
  e000-e007 : ide0
  e008-e00f : ide1
e400-e43f : Intel Corp. 82371AB PIIX4 ACPI
e800-e81f : Intel Corp. 82371AB PIIX4 ACPI

00000000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000c7fff : Video ROM
000f0000-000fffff : System ROM
00100000-0fffffff : System RAM
  00100000-002301dc : Kernel code
  002301dd-002a05f3 : Kernel data
e5800000-e58fffff : Sigma Designs, Inc. REALmagic Hollywood Plus DVD Decoder
e6000000-e6003fff : Matrox Graphics, Inc. MGA 2064W [Millennium]
  e6000000-e6003fff : matroxfb MMIO
e7000000-e77fffff : Matrox Graphics, Inc. MGA 2064W [Millennium]
  e7000000-e77fffff : matroxfb FB
ffff0000-ffffffff : reserved

[7.5.] PCI information ('lspci -vvv' as root):

00:00.0 Host bridge: Intel Corporation 430TX - 82439TX MTXC (rev 01)
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
	Latency: 32 set

00:01.0 ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 01)
	Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 0 set

00:01.1 IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01) (prog-if 80 [Master])
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 32 set
	Region 4: I/O ports at e000 [size=16]

00:01.2 USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 01) (prog-if 00 [UHCI])
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 32 set
	Interrupt: pin D routed to IRQ 9
	Region 4: I/O ports at d800 [size=32]

00:01.3 Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 01)
	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Interrupt: pin ? routed to IRQ 9

00:09.0 Ethernet controller: 3Com Corporation 3c900 Combo [Boomerang]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 3 min, 8 max, 32 set
	Interrupt: pin A routed to IRQ 9
	Region 0: I/O ports at d400 [size=64]
	Expansion ROM at <unassigned> [disabled] [size=64K]

00:0a.0 VGA compatible controller: Matrox Graphics, Inc. MGA 2064W [Millennium] (rev 01) (prog-if 00 [VGA])
	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
	Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Interrupt: pin A routed to IRQ 9
	Region 0: Memory at e6000000 (32-bit, non-prefetchable) [size=16K]
	Region 1: Memory at e7000000 (32-bit, prefetchable) [size=8M]
	Expansion ROM at <unassigned> [disabled] [size=64K]

00:0b.0 Multimedia controller: Sigma Designs, Inc. REALmagic Hollywood Plus DVD Decoder (rev 02)
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 32 set
	Interrupt: pin A routed to IRQ 10
	Region 0: Memory at e5800000 (32-bit, non-prefetchable) [size=1M]
	Capabilities: [40] Power Management version 1
		Flags: PMEClk- AuxPwr- DSI- D1- D2- PME-
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:0c.0 Multimedia audio controller: Ensoniq: Unknown device 5880 (rev 02)
	Subsystem: Ensoniq: Unknown device 2000
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=slow >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 12 min, 128 max, 32 set
	Interrupt: pin A routed to IRQ 11
	Region 0: I/O ports at d000 [size=64]
	Capabilities: [dc] Power Management version 1
		Flags: PMEClk- AuxPwr- DSI+ D1- D2+ PME-
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-


[7.6.] SCSI information (from /proc/scsi/scsi):

Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: LITE-ON  Model: LTR-16102B       Rev: OS09
  Type:   CD-ROM                           ANSI SCSI revision: 02

[7.7.] mdstat at the time of the second shutdown (the one that triggers the kernel BUG)

Personalities : [raid1]
md0 : active raid1 hdc1[1] hda13[0]
      1228864 blocks [2/2] [UU]
      [==>..................]  resync = 14.2% (175424/1228864) finish=6.1min speed=2843K/sec
unused devices: <none>
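
Since the BUG only shows up when shutdown interrupts a resync, a shutdown
script could check for a running resync first. A minimal Python sketch,
assuming the /proc/mdstat layout shown above; the function name
resyncing_arrays is hypothetical and this has not been tried on 2.5.x:

```python
import re

# Sample taken from the mdstat output above.
MDSTAT_SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 hdc1[1] hda13[0]
      1228864 blocks [2/2] [UU]
      [==>..................]  resync = 14.2% (175424/1228864) finish=6.1min speed=2843K/sec
unused devices: <none>
"""


def resyncing_arrays(mdstat_text):
    """Return {array: (done_blocks, total_blocks)} for arrays that are mid-resync."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)  # remember which array the next lines describe
            continue
        progress = re.search(r"resync\s*=\s*[\d.]+%\s*\((\d+)/(\d+)\)", line)
        if progress and current:
            result[current] = (int(progress.group(1)), int(progress.group(2)))
    return result


print(resyncing_arrays(MDSTAT_SAMPLE))  # {'md0': (175424, 1228864)}
```

On a live system the text would come from open("/proc/mdstat").read(); an
empty result means no array is resyncing.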

[X.]

If you need further information, or a tester for a patch, please don't
hesitate to ask!

BR, Jani

--
Jani Averbach                 jaa@iki.fi              +358 40 759 0984

