>
> Please show the output of my 'lsdrv' script [1] as your system is now
> set up.
>

# ./raidfail/lsdrv
PCI [ahci] 00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02)
├scsi 0:0:0:0 ATA WDC WD20EARX-00P {WD-WCAZAL145223}
│└sda 1.82t [8:0] Partitioned (dos)
│ ├sda1 4.66g [8:1] ext4 {23b488a2-5a22-487a-a83f-bfa761754617}
│ │└Mounted as /dev/sda1 @ /boot
│ ├sda2 1.00k [8:2] Partitioned (dos)
│ ├sda5 29.80g [8:5] swap {720281cd-d82f-4368-ae44-f68408f28282}
│ ├sda6 51.22g [8:6] ext4 {321440e1-4078-4605-9d3d-4419bcb4d618}
│ │└Mounted as /dev/sda6 @ /var
│ └sda7 1.74t [8:7] ext4 {f52145cc-c13f-4230-89a0-e2a343f956f7}
│  └Mounted as /dev/disk/by-uuid/f52145cc-c13f-4230-89a0-e2a343f956f7 @ /
├scsi 1:x:x:x [Empty]
├scsi 2:x:x:x [Empty]
├scsi 3:x:x:x [Empty]
├scsi 4:x:x:x [Empty]
└scsi 5:x:x:x [Empty]
PCI [ahci] 04:00.0 SATA controller: JMicron Technology Corp. JMB363 SATA/IDE Controller (rev 02)
├scsi 6:0:0:0 ATA ST2000DL004 HD20 {S2H7J9FC302772}
│└sdb 1.82t [8:16] Partitioned (dos)
│ └sdb1 1.82t [8:17] MD raid5 (4) inactive {44ecd957-d23c-44b1-b664-13437cc40f45}
└scsi 7:x:x:x [Empty]
PCI [sata_sil24] 07:01.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)
├scsi 8:0:0:0 ATA SAMSUNG HD204UI {S2H7JD2B105685}
│└sdc 1.82t [8:32] Partitioned (dos)
│ └sdc1 1.82t [8:33] MD raid5 (4) inactive {44ecd957-d23c-44b1-b664-13437cc40f45}
├scsi 9:x:x:x [Empty]
├scsi 10:0:0:0 ATA SAMSUNG HD204UI {S2H7JD2B105686}
│└sdd 1.82t [8:48] Partitioned (dos)
│ └sdd1 1.82t [8:49] MD raid5 (4) inactive {44ecd957-d23c-44b1-b664-13437cc40f45}
└scsi 11:0:0:0 ATA SAMSUNG HD204UI {S2H7JD2B105687}
 └sde 1.82t [8:64] Partitioned (dos)
  └sde1 1.82t [8:65] MD raid5 (4) inactive {44ecd957-d23c-44b1-b664-13437cc40f45}
PCI [pata_jmicron] 04:00.1 IDE interface: JMicron Technology Corp. JMB363 SATA/IDE Controller (rev 02)
├scsi 12:0:0:0 LITE-ON DVDRW SHM-165H6S {LITE-ON_DVDRW_SHM-165H6S}
│└sr0 1.00g [11:0] Empty/Unknown
└scsi 13:x:x:x [Empty]
Other Block Devices
├loop0 0.00k [7:0] Empty/Unknown
├loop1 0.00k [7:1] Empty/Unknown
├loop2 0.00k [7:2] Empty/Unknown
├loop3 0.00k [7:3] Empty/Unknown
├loop4 0.00k [7:4] Empty/Unknown
├loop5 0.00k [7:5] Empty/Unknown
├loop6 0.00k [7:6] Empty/Unknown
├loop7 0.00k [7:7] Empty/Unknown
├ram0 64.00m [1:0] Empty/Unknown
├ram1 64.00m [1:1] Empty/Unknown
├ram2 64.00m [1:2] Empty/Unknown
├ram3 64.00m [1:3] Empty/Unknown
├ram4 64.00m [1:4] Empty/Unknown
├ram5 64.00m [1:5] Empty/Unknown
├ram6 64.00m [1:6] Empty/Unknown
├ram7 64.00m [1:7] Empty/Unknown
├ram8 64.00m [1:8] Empty/Unknown
├ram9 64.00m [1:9] Empty/Unknown
├ram10 64.00m [1:10] Empty/Unknown
├ram11 64.00m [1:11] Empty/Unknown
├ram12 64.00m [1:12] Empty/Unknown
├ram13 64.00m [1:13] Empty/Unknown
├ram14 64.00m [1:14] Empty/Unknown
└ram15 64.00m [1:15] Empty/Unknown

> Your drive with S/N S2H7JD2B105688 seems to be the worst, with
> triple-digit pending sectors. This suggests a mismatch between your
> drives' error correction time limits and the linux drivers' default
> timeout.

I'm not sure that I understand this. Wouldn't the drive move a bad
sector regardless of the OS timeout? Can you point me to more
information on correcting the time limits?

The change in device mapping went like this:

At Failure                       --> Now
sdc                              --> sdc
sdd (2nd drop, most errors)      --> ddrescue to sdb and then unplugged
sde (1st drop, low event count)  --> sdd
sdf                              --> sde

> And a lack of regular scrubbing to clean up pending sectors.
> "smartctl -l scterc" for each drive would give useful information.
> Anyways, the drive may not be really failing--it has zero relocations.
>
> If S2H7JD2B105688 was the old /dev/sdd, then it doesn't matter, but
> you've now lost the opportunity to correct those sectors.

The failed sdd has the serial number S2H7JD2B105688.
I still have the drive; it's just unplugged.

Running "smartctl -l scterc" produces some interesting results.

# smartctl -l scterc /dev/sdb
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-44-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

SCT Error Recovery Control:
           Read: Disabled
          Write: Disabled

# smartctl -l scterc /dev/sdc
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-44-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

SCT Error Recovery Control:
           Read: Disabled
          Write: Disabled

# smartctl -l scterc /dev/sdd
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-44-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

SCT Error Recovery Control:
           Read: Disabled
          Write: Disabled

# smartctl -l scterc /dev/sde
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-44-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

SCT Error Recovery Control:
           Read: Disabled
          Write: Disabled

What is going on here? How would error recovery get disabled?

>
> Phil
>
> [1] http://github.com/pturmel/lsdrv/
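
For reference, here is a rough sketch of what checking and adjusting the time limits could look like. It is not taken from Phil's advice. It assumes the usual approach: ask the drive to cap its internal recovery time with smartctl's "-l scterc,READTIME,WRITETIME" form (values in tenths of a second) and, if the drive will not accept that, raise the kernel's per-device command timeout in /sys/block/<dev>/device/timeout (30 seconds by default). The helper names erc_state and fix_timeouts, the 7.0 second limit, and the 180 second fallback are all illustrative choices.

```shell
#!/bin/sh
# Sketch only: erc_state and fix_timeouts are hypothetical helper names.

# Classify "smartctl -l scterc" output read from stdin.
# Prints "disabled", "enabled", or "unsupported".
erc_state() {
    awk '
        $1 == "Read:" || $1 == "Write:" {
            seen = 1
            if ($2 == "Disabled") disabled = 1
        }
        END {
            if (!seen)         print "unsupported"
            else if (disabled) print "disabled"
            else               print "enabled"
        }'
}

# Ask the drive to give up on a bad sector after 7.0 s (70 tenths of a
# second). If ERC still is not enabled afterwards, lengthen the kernel
# timeout instead. Both numbers are assumptions. Must be run as root.
fix_timeouts() {
    dev=$1    # bare device name, e.g. sdb (no /dev/ prefix)
    smartctl -l scterc,70,70 "/dev/$dev" >/dev/null
    if [ "$(smartctl -l scterc "/dev/$dev" | erc_state)" != "enabled" ]; then
        echo 180 > "/sys/block/$dev/device/timeout"
    fi
}
```

It would be called once per array member, for example: for d in sdb sdc sdd sde; do fix_timeouts "$d"; done. The drive-side setting generally does not survive a power cycle, so a loop like this would normally live in a boot script.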