> I disagree.  The patch just covers over the problem by making the
> race substantially less likely to be lost.
> You must remember that the raid5d could be running at any time,
> especially on an SMP machine.  So there must be some sort of locking
> to make sure that rdev->sb is not freed while the super blocks are
> being written out.
>

Yes, I see it.

> I have a patch, included below, which I think addresses the problem
> most thoroughly.  I would appreciate comments.
>

Applied. Going to test now ...

> A particular feature is that rdev->sb is *not* freed whenever ->faulty
> is set to 1.  Rather, the free is delayed until md_update_db is
> called, which will be shortly after.  This goes a long way to curing
> the race.
>
> I have lock/unlock_mddev at several places, so that any reconfiguring
> of the mddev is all under the reconfig_sem semaphore.
>

Test results:

It seems that something is still not quite OK. I got an OOPS after
raidsetfaulty this time, but a different one than before (it looks more
subtle now).

After a reboot and another raidsetfaulty command there was no OOPS, but
after activating the disk the reconstruction seems stuck (it has now been
over 15 min on a 500 MB partition):

md2 : active raid5 hdi3[3] hdg3[1] hde3[0]
      1076224 blocks level 5, 8k chunk, algorithm 0 [3/2] [UU_]
      [>....................]  recovery =  0.0% (0/538112) finish=72734.8min speed=0K/sec

Can I test with something more?

I'm including the log of the procedure here:

Feb 28 15:48:48 temp kernel: raid5: Disk failure on hdi3, disabling device.
Operation continuing on 2 devices
Feb 28 15:48:48 temp kernel: Unable to handle kernel paging request at virtual address a92b4efc
Feb 28 15:48:48 temp kernel: printing eip:
Feb 28 15:48:48 temp kernel: c0116c0c
Feb 28 15:48:48 temp kernel: *pde = 00000000
Feb 28 15:48:48 temp kernel: Oops: 0002
Feb 28 15:48:48 temp kernel: CPU: 1
Feb 28 15:48:48 temp kernel: EIP: 0010:[add_wait_queue_exclusive+40/52] Not tainted
Feb 28 15:48:48 temp kernel: EFLAGS: 00010006
Feb 28 15:48:48 temp kernel: eax: dfe57ffc ebx: a92b4efc ecx: 00000202 edx: dfe57f78
Feb 28 15:48:48 temp kernel: esi: dfe57f70 edi: dfe57ff8 ebp: 00000001 esp: dfe57f54
Feb 28 15:48:48 temp kernel: ds: 0018 es: 0018 ss: 0018
Feb 28 15:48:48 temp kernel: Process raid5d (pid: 9, stackpage=dfe57000)
Feb 28 15:48:48 temp kernel: Stack: dfe57ff0 dfe57f70 dfe56000 c0105cb9 dfe56000 c183e000 dff70ce0 00000001
Feb 28 15:48:48 temp kernel: dfe56000 dfe57ffc a92b4efc <6>md: recovery thread got woken up ...
Feb 28 15:48:48 temp kernel: c0105e80 dfe57ff0 c01d3478 <6>md: updating md2 RAID superblock on device
Feb 28 15:48:48 temp kernel: md: (skipping faulty dfe59ec0 hdi3 )
Feb 28 15:48:48 temp kernel: md: hdg3 [events: 00000087]c01d4618
Feb 28 15:48:48 temp kernel: dfe56000 dfe56000 dff70ce0 00000001 00000000 md: bug in file md.c, line 903
Feb 28 15:48:48 temp kernel:
Feb 28 15:48:48 temp kernel: md:^I**********************************
Feb 28 15:48:48 temp kernel: c1808000 dfe59ec0 c01da7ec
Feb 28 15:48:48 temp kernel: Call Trace: [__down+65/200] [__down_failed+8/12] [raid5d+0/392] [_text_lock_raid5+277/381] [md_thread+332/432]
Feb 28 15:48:48 temp kernel: [kernel_thread+39/56]
Feb 28 15:48:48 temp kernel:
Feb 28 15:48:48 temp kernel: Code: 89 13 c6 07 01 51 9d 5b 5e 5f c3 90 53 89 c3 9c 58 fa f0 fe
Feb 28 15:48:48 temp kernel: md:^I* <COMPLETE RAID STATE PRINTOUT> *
Feb 28 15:48:48 temp kernel: md:^I**********************************
Feb 28 15:48:48 temp kernel: md0: <hdi1><hdg1><hde1><hda1> array superblock:
Feb 28 15:48:48 temp kernel: md: SB: (V:0.90.0) ID:<792b53e7.ca5bb1f3.c6632d09.1c82aa64> CT:3c57f1b6
Feb 28 15:48:48 temp kernel: md: L1 S00056128 ND:5 RD:4 md0 LO:0 CS:4096
Feb 28 15:48:48 temp kernel: md: UT:3c7e42ef ST:0 AD:4 WD:4 FD:1 SD:0 CSUM:486f4e27 E:00000073
Feb 28 15:48:48 temp kernel: D 0: DISK<N:0,hde1(33,1),R:0,S:6>
Feb 28 15:48:48 temp kernel: D 1: DISK<N:1,hdg1(34,1),R:1,S:6>
Feb 28 15:48:48 temp kernel: D 2: DISK<N:2,hdi1(56,1),R:2,S:6>
Feb 28 15:48:48 temp kernel: D 3: DISK<N:3,hda1(3,1),R:3,S:6>
Feb 28 15:48:48 temp kernel: D 4: DISK<N:4,[dev 00:00](0,0),R:4,S:9>
Feb 28 15:48:48 temp kernel: md: THIS: DISK<N:2,hdi1(56,1),R:2,S:6>
Feb 28 15:48:48 temp kernel: md: rdev hdi1: O:hdi1, SZ:00056128 F:0 DN:2 <6>md: rdev superblock:
Feb 28 15:48:48 temp kernel: md: SB: (V:0.90.0) ID:<792b53e7.ca5bb1f3.c6632d09.1c82aa64> CT:3c57f1b6
Feb 28 15:48:48 temp kernel: md: L1 S00056128 ND:5 RD:4 md0 LO:0 CS:4096
Feb 28 15:48:48 temp kernel: md: UT:3c7e42ef ST:0 AD:4 WD:4 FD:1 SD:0 CSUM:486f4e5d E:00000073
Feb 28 15:48:48 temp kernel: D 0: DISK<N:0,hde1(33,1),R:0,S:6>
Feb 28 15:48:48 temp kernel: D 1: DISK<N:1,hdg1(34,1),R:1,S:6>
Feb 28 15:48:48 temp kernel: D 2: DISK<N:2,hdi1(56,1),R:2,S:6>
Feb 28 15:48:48 temp kernel: D 3: DISK<N:3,hda1(3,1),R:3,S:6>
Feb 28 15:48:48 temp kernel: D 4: DISK<N:4,[dev 00:00](0,0),R:4,S:9>
Feb 28 15:48:48 temp kernel: md: THIS: DISK<N:2,hdi1(56,1),R:2,S:6>
Feb 28 15:48:48 temp kernel: md: rdev hdg1: O:hdg1, SZ:00056128 F:0 DN:1 <6>md: rdev superblock:
Feb 28 15:48:48 temp kernel: md: SB: (V:0.90.0) ID:<792b53e7.ca5bb1f3.c6632d09.1c82aa64> CT:3c57f1b6
Feb 28 15:48:48 temp kernel: md: L1 S00056128 ND:5 RD:4 md0 LO:0 CS:4096
Feb 28 15:48:48 temp kernel: md: UT:3c7e42ef ST:0 AD:4 WD:4 FD:1 SD:0 CSUM:486f4e45 E:00000073
Feb 28 15:48:48 temp kernel: D 0: DISK<N:0,hde1(33,1),R:0,S:6>
Feb 28 15:48:48 temp kernel: D 1: DISK<N:1,hdg1(34,1),R:1,S:6>
Feb 28 15:48:48 temp kernel: D 2: DISK<N:2,hdi1(56,1),R:2,S:6>
Feb 28 15:48:48 temp kernel: D 3: DISK<N:3,hda1(3,1),R:3,S:6>
Feb 28 15:48:48 temp kernel: D 4: DISK<N:4,[dev 00:00](0,0),R:4,S:9>
Feb 28 15:48:48 temp kernel: md: THIS: DISK<N:1,hdg1(34,1),R:1,S:6>
Feb 28 15:48:48 temp kernel: md: rdev hde1: O:hde1, SZ:00056128 F:0 DN:0 <6>md: rdev superblock:
Feb 28 15:48:48 temp kernel: md: SB: (V:0.90.0) ID:<792b53e7.ca5bb1f3.c6632d09.1c82aa64> CT:3c57f1b6
Feb 28 15:48:48 temp kernel: md: L1 S00056128 ND:5 RD:4 md0 LO:0 CS:4096
Feb 28 15:48:48 temp kernel: md: UT:3c7e42ef ST:0 AD:4 WD:4 FD:1 SD:0 CSUM:486f4e42 E:00000073
Feb 28 15:48:48 temp kernel: D 0: DISK<N:0,hde1(33,1),R:0,S:6>
Feb 28 15:48:48 temp kernel: D 1: DISK<N:1,hdg1(34,1),R:1,S:6>
Feb 28 15:48:48 temp kernel: D 2: DISK<N:2,hdi1(56,1),R:2,S:6>
Feb 28 15:48:48 temp kernel: D 3: DISK<N:3,hda1(3,1),R:3,S:6>
Feb 28 15:48:48 temp kernel: D 4: DISK<N:4,[dev 00:00](0,0),R:4,S:9>
Feb 28 15:48:48 temp kernel: md: THIS: DISK<N:0,hde1(33,1),R:0,S:6>
Feb 28 15:48:48 temp kernel: md: rdev hda1: O:hda1, SZ:00056128 F:0 DN:3 <6>md: rdev superblock:
Feb 28 15:48:48 temp kernel: md: SB: (V:0.90.0) ID:<792b53e7.ca5bb1f3.c6632d09.1c82aa64> CT:3c57f1b6
Feb 28 15:48:48 temp kernel: md: L1 S00056128 ND:5 RD:4 md0 LO:0 CS:4096
Feb 28 15:48:48 temp kernel: md: UT:3c7e42ef ST:0 AD:4 WD:4 FD:1 SD:0 CSUM:486f4e2a E:00000073
Feb 28 15:48:48 temp kernel: D 0: DISK<N:0,hde1(33,1),R:0,S:6>
Feb 28 15:48:48 temp kernel: D 1: DISK<N:1,hdg1(34,1),R:1,S:6>
Feb 28 15:48:48 temp kernel: D 2: DISK<N:2,hdi1(56,1),R:2,S:6>
Feb 28 15:48:48 temp kernel: D 3: DISK<N:3,hda1(3,1),R:3,S:6>
Feb 28 15:48:48 temp kernel: D 4: DISK<N:4,[dev 00:00](0,0),R:4,S:9>
Feb 28 15:48:48 temp kernel: md: THIS: DISK<N:3,hda1(3,1),R:3,S:6>
Feb 28 15:48:48 temp kernel: md1: <hdi2><hdg2><hde2> array superblock:
Feb 28 15:48:48 temp kernel: md: SB: (V:0.90.0) ID:<f8b9f7cd.d7f35a10.0c3e0ce8.9f940d45> CT:3c59659e
Feb 28 15:48:48 temp kernel: md: L5 S03076352 ND:3 RD:3 md1 LO:0 CS:32768
Feb 28 15:48:48 temp kernel: md: UT:3c7e42ef ST:0 AD:3 WD:3 FD:0 SD:0 CSUM:9eb1d669 E:00000054
Feb 28 15:48:48 temp kernel: D 0: DISK<N:0,hde2(33,2),R:0,S:6>
Feb 28 15:48:48 temp kernel: D 1: DISK<N:1,hdg2(34,2),R:1,S:6>
Feb 28 15:48:48 temp kernel: D 2: DISK<N:2,hdi2(56,2),R:2,S:6>
Feb 28 15:48:48 temp kernel: D 3: DISK<N:3,[dev 00:00](0,0),R:3,S:9>
Feb 28 15:48:48 temp kernel: md: THIS: DISK<N:2,hdi2(56,2),R:2,S:6>
Feb 28 15:48:48 temp kernel: md: rdev hdi2: O:hdi2, SZ:03076352 F:0 DN:2 <6>md: rdev superblock:
Feb 28 15:48:48 temp kernel: md: SB: (V:0.90.0) ID:<f8b9f7cd.d7f35a10.0c3e0ce8.9f940d45> CT:3c59659e
Feb 28 15:48:48 temp kernel: md: L5 S03076352 ND:3 RD:3 md1 LO:0 CS:32768
Feb 28 15:48:48 temp kernel: md: UT:3c7e42ef ST:0 AD:3 WD:3 FD:0 SD:0 CSUM:9eb1d69e E:00000054
Feb 28 15:48:48 temp kernel: D 0: DISK<N:0,hde2(33,2),R:0,S:6>
Feb 28 15:48:48 temp kernel: D 1: DISK<N:1,hdg2(34,2),R:1,S:6>
Feb 28 15:48:48 temp kernel: D 2: DISK<N:2,hdi2(56,2),R:2,S:6>
Feb 28 15:48:48 temp kernel: D 3: DISK<N:3,[dev 00:00](0,0),R:3,S:9>
Feb 28 15:48:48 temp kernel: md: THIS: DISK<N:2,hdi2(56,2),R:2,S:6>
Feb 28 15:48:48 temp kernel: md: rdev hdg2: O:hdg2, SZ:03076352 F:0 DN:1 <6>md: rdev superblock:
Feb 28 15:48:48 temp kernel: md: SB: (V:0.90.0) ID:<f8b9f7cd.d7f35a10.0c3e0ce8.9f940d45> CT:3c59659e
Feb 28 15:48:48 temp kernel: md: L5 S03076352 ND:3 RD:3 md1 LO:0 CS:32768
Feb 28 15:48:48 temp kernel: md: UT:3c7e42ef ST:0 AD:3 WD:3 FD:0 SD:0 CSUM:9eb1d686 E:00000054
Feb 28 15:48:48 temp kernel: D 0: DISK<N:0,hde2(33,2),R:0,S:6>
Feb 28 15:48:48 temp kernel: D 1: DISK<N:1,hdg2(34,2),R:1,S:6>
Feb 28 15:48:48 temp kernel: D 2: DISK<N:2,hdi2(56,2),R:2,S:6>
Feb 28 15:48:48 temp kernel: D 3: DISK<N:3,[dev 00:00](0,0),R:3,S:9>
Feb 28 15:48:48 temp kernel: md: THIS: DISK<N:1,hdg2(34,2),R:1,S:6>
Feb 28 15:48:48 temp kernel: md: rdev hde2: O:hde2, SZ:03076352 F:0 DN:0 <6>md: rdev superblock:
Feb 28 15:48:48 temp kernel: md: SB: (V:0.90.0) ID:<f8b9f7cd.d7f35a10.0c3e0ce8.9f940d45> CT:3c59659e
Feb 28 15:48:48 temp kernel: md: L5 S03076352 ND:3 RD:3 md1 LO:0 CS:32768
Feb 28 15:48:48 temp kernel: md: UT:3c7e42ef ST:0 AD:3 WD:3 FD:0 SD:0 CSUM:9eb1d683 E:00000054
Feb 28 15:48:48 temp kernel: D 0: DISK<N:0,hde2(33,2),R:0,S:6>
Feb 28 15:48:48 temp kernel: D 1: DISK<N:1,hdg2(34,2),R:1,S:6>
Feb 28 15:48:48 temp kernel: D 2: DISK<N:2,hdi2(56,2),R:2,S:6>
Feb 28 15:48:48 temp kernel: D 3: DISK<N:3,[dev 00:00](0,0),R:3,S:9>
Feb 28 15:48:48 temp kernel: md: THIS: DISK<N:0,hde2(33,2),R:0,S:6>

And OOPS:

Oops: 0002
CPU: 1
EIP: 0010:[add_wait_queue_exclusive+40/52] Not tainted
EFLAGS: 00010006
eax: dfe57ffc ebx: a92b4efc ecx: 00000202 edx: dfe57f78
esi: dfe57f70 edi: dfe57ff8 ebp: 00000001 esp: dfe57f54
ds: 0018 es: 0018 ss: 0018
Process raid5d (pid: 9, stackpage=dfe57000)
Stack: dfe57ff0 dfe57f70 dfe56000 c0105cb9 dfe56000 c183e000 dff70ce0 00000001
       dfe56000 dfe57ffc a92b4efc c0105e80 dfe57ff0 c01d3478 c01d4618 dfe56000
       dfe56000 dff70ce0 00000001 00000000 c1808000 dfe59ec0 c01da7ec
Call Trace: [__down+65/200] [__down_failed+8/12] [raid5d+0/392] [_text_lock_raid5+277/381] [md_thread+332/432]
Code: 89 13 c6 07 01 51 9d 5b 5e 5f c3 90 53 89 c3 9c 58 fa f0 fe
Using defaults from ksymoops -t elf32-i386 -a i386

Code; 00000000 Before first symbol
0000000000000000 <_EIP>:
Code; 00000000 Before first symbol
   0: 89 13 mov %edx,(%ebx)
Code; 00000002 Before first symbol
   2: c6 07 01 movb $0x1,(%edi)
Code; 00000004 Before first symbol
   5: 51 push %ecx
Code; 00000006 Before first symbol
   6: 9d popf
Code; 00000006 Before first symbol
   7: 5b pop %ebx
Code; 00000008 Before first symbol
   8: 5e pop %esi
Code; 00000008 Before first symbol
   9: 5f pop %edi
Code; 0000000a Before first symbol
   a: c3 ret
Code; 0000000a Before first symbol
   b: 90 nop
Code; 0000000c Before first symbol
   c: 53 push %ebx
Code; 0000000c Before first symbol
   d: 89 c3 mov %eax,%ebx
Code; 0000000e Before first symbol
   f: 9c pushf
Code; 00000010 Before first symbol
  10: 58 pop %eax
Code; 00000010 Before first symbol
  11: fa cli
Code; 00000012 Before first symbol
  12: f0 fe 00 lock incb (%eax)

1 warning and 2 errors issued. Results may not be reliable.
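
A note for reading the trace: on i386 the value printed after "Oops:" is the page-fault error code, which is a bit field. The sketch below decodes it. The constant and function names are illustrative labels chosen here, not identifiers confirmed from the kernel source in use, and the input values are copied from the report above.

```python
# Minimal sketch: decode the i386 page-fault error code from an "Oops: NNNN" line.
# Names below are illustrative labels (an assumption), not taken from md.c or the
# kernel fault handler of that era.
PF_PROT = 0x1   # 0: page not present, 1: protection violation
PF_WRITE = 0x2  # 0: read access,      1: write access
PF_USER = 0x4   # 0: kernel mode,      1: user mode


def decode_oops_code(code_text):
    """Return a dict describing the hex error code given as text, e.g. '0002'."""
    code = int(code_text, 16)
    return {
        "cause": "protection violation" if code & PF_PROT else "page not present",
        "access": "write" if code & PF_WRITE else "read",
        "mode": "user" if code & PF_USER else "kernel",
    }


if __name__ == "__main__":
    # "Oops: 0002" from the report.
    print(decode_oops_code("0002"))

    # The reported faulting address and the ebx register value from the dump.
    fault_addr = 0xA92B4EFC
    ebx = 0xA92B4EFC
    print("fault address equals ebx:", fault_addr == ebx)
```

For the `0002` reported here this gives a kernel-mode write to a page that is not present. That is consistent with the first decoded instruction, `mov %edx,(%ebx)`, which stores through ebx, and ebx holds a92b4efc, the reported faulting address.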