Help recovering a raid6 device with a kicked drive (too many)

Hi,

I had a drive fail today. After replacing it I booted with a loose cable (doh!), so some of my md devices ended up with an extra failed drive. "Oh well, it'll just rebuild," I foolishly thought.

Of course, during the rebuild another drive (sdg12) failed (with read errors).

During a rebuild one of my raid6 devices failed ("read error not correctable"). I'd like to try putting the device back together so I can copy as much data as possible off it. From what I read recently on the list, I think these commands would force the array to start again. Is that right?

   mdadm -S /dev/md12
   mdadm -C -n 7 -l 6 /dev/md12 /dev/sdf12 /dev/sdg12 /dev/sde12 /dev/sdc12 missing missing /dev/sda12
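
For comparison, a forced assemble is the usual non-destructive alternative to re-creating, since it reuses the existing superblocks instead of writing new ones. The following is only a dry-run sketch, untested on this array: the `RUN` variable and the `force_assemble` function are illustrative names, and the member list is an assumption taken from the devices shown as "active sync" in the -E output further down.

```shell
# Dry run by default: each command is printed, not executed.
# Set RUN="" in the environment to really run them (as root).
RUN=${RUN-echo}
MD=/dev/md12
# Members still carrying data (RaidDevice 0,1,2,3,6); slots 4 and 5 are gone.
MEMBERS="/dev/sdf12 /dev/sdg12 /dev/sde12 /dev/sdc12 /dev/sda12"

force_assemble() {
    $RUN mdadm --stop "$MD"
    # --force asks mdadm to accept a member whose event count is out of date.
    $RUN mdadm --assemble --force "$MD" $MEMBERS
}

force_assemble
```

Whether this gets further than the -C approach depends on whether sdg12 can still be read well enough to stay in the array.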

Is there a way to make md not kick the drive again when I try copying the data off?

Below I've posted the error first, then the mdadm -E output from each of the devices in the array.

Thanks!

   - ask


This was the failure:

ata6: spurious interrupt (irq_stat 0x8 active_tag -84148995 sactive 0x0)
ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata7.00: (BMDMA stat 0x20)
ata7.00: tag 0 cmd 0x25 Emask 0x9 stat 0x51 err 0x40 (media error)
ata7: EH complete
ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata7.00: (BMDMA stat 0x20)
ata7.00: tag 0 cmd 0x25 Emask 0x9 stat 0x51 err 0x40 (media error)
ata7: EH complete
ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata7.00: (BMDMA stat 0x20)
ata7.00: tag 0 cmd 0x25 Emask 0x9 stat 0x51 err 0x40 (media error)
ata7: EH complete
ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata7.00: (BMDMA stat 0x20)
ata7.00: tag 0 cmd 0x25 Emask 0x9 stat 0x51 err 0x40 (media error)
ata7: EH complete
ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata7.00: (BMDMA stat 0x20)
ata7.00: tag 0 cmd 0x25 Emask 0x9 stat 0x51 err 0x40 (media error)
ata7: EH complete
ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata7.00: (BMDMA stat 0x20)
ata7.00: tag 0 cmd 0x25 Emask 0x9 stat 0x51 err 0x40 (media error)
sd 6:0:0:0: SCSI error: return code = 0x08000002
sdg: Current: sense key: Medium Error
    Additional sense: Unrecovered read error - auto reallocate failed
end_request: I/O error, dev sdg, sector 487466631
raid5:md12: read error not correctable (sector 58595328 on sdg12).
raid5: Disk failure on sdg12, disabling device. Operation continuing on 4 devices
raid5:md12: read error not correctable (sector 58595336 on sdg12).
raid5:md12: read error not correctable (sector 58595344 on sdg12).
raid5:md12: read error not correctable (sector 58595352 on sdg12).
raid5:md12: read error not correctable (sector 58595360 on sdg12).
raid5:md12: read error not correctable (sector 58595368 on sdg12).
raid5:md12: read error not correctable (sector 58595376 on sdg12).
raid5:md12: read error not correctable (sector 58595384 on sdg12).
raid5:md12: read error not correctable (sector 58595392 on sdg12).
raid5:md12: read error not correctable (sector 58595400 on sdg12).



/dev/sdg12:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : ab10495a:eed4723d:e1075255:4dc67314
  Creation Time : Tue Apr 18 02:58:51 2006
     Raid Level : raid6
    Device Size : 29302464 (27.95 GiB 30.01 GB)
     Array Size : 146512320 (139.73 GiB 150.03 GB)
   Raid Devices : 7
  Total Devices : 5
Preferred Minor : 12

    Update Time : Fri Dec 15 01:16:37 2006
          State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 2
  Spare Devices : 0
       Checksum : fdcd0960 - correct
         Events : 0.3406784

     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8      108        1      active sync   /dev/sdg12

   0     0       8       92        0      active sync   /dev/sdf12
   1     1       8      108        1      active sync   /dev/sdg12
   2     2       8       76        2      active sync   /dev/sde12
   3     3       8       44        3      active sync   /dev/sdc12
   4     4       0        0        4      faulty removed
   5     5       0        0        5      faulty removed
   6     6       8       12        6      active sync   /dev/sda12

/dev/sda12:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : ab10495a:eed4723d:e1075255:4dc67314
  Creation Time : Tue Apr 18 02:58:51 2006
     Raid Level : raid6
    Device Size : 29302464 (27.95 GiB 30.01 GB)
     Array Size : 146512320 (139.73 GiB 150.03 GB)
   Raid Devices : 7
  Total Devices : 6
Preferred Minor : 12

    Update Time : Fri Dec 15 02:53:39 2006
          State : clean
Active Devices : 4
Working Devices : 5
Failed Devices : 3
  Spare Devices : 1
       Checksum : fdcd203c - correct
         Events : 0.3406790

     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     6       8       12        6      active sync   /dev/sda12

   0     0       8       92        0      active sync   /dev/sdf12
   1     1       0        0        1      faulty removed
   2     2       8       76        2      active sync   /dev/sde12
   3     3       8       44        3      active sync   /dev/sdc12
   4     4       0        0        4      faulty removed
   5     5       0        0        5      faulty removed
   6     6       8       12        6      active sync   /dev/sda12
   7     7       8       60        7      spare   /dev/sdd12


/dev/sdc12:
       Checksum : fdcd2056 - correct

      Number   Major   Minor   RaidDevice State
this     3       8       44        3      active sync   /dev/sdc12

   0     0       8       92        0      active sync   /dev/sdf12
   1     1       0        0        1      faulty removed
   2     2       8       76        2      active sync   /dev/sde12
   3     3       8       44        3      active sync   /dev/sdc12
   4     4       0        0        4      faulty removed
   5     5       0        0        5      faulty removed
   6     6       8       12        6      active sync   /dev/sda12
   7     7       8       60        7      spare   /dev/sdd12

/dev/sdd12:
       Checksum : fdcd2068 - correct

      Number   Major   Minor   RaidDevice State
this     7       8       60        7      spare   /dev/sdd12

   0     0       8       92        0      active sync   /dev/sdf12
   1     1       0        0        1      faulty removed
   2     2       8       76        2      active sync   /dev/sde12
   3     3       8       44        3      active sync   /dev/sdc12
   4     4       0        0        4      faulty removed
   5     5       0        0        5      faulty removed
   6     6       8       12        6      active sync   /dev/sda12
   7     7       8       60        7      spare   /dev/sdd12

/dev/sde12:
       Checksum : fdcd2074 - correct

      Number   Major   Minor   RaidDevice State
this     2       8       76        2      active sync   /dev/sde12

   0     0       8       92        0      active sync   /dev/sdf12
   1     1       0        0        1      faulty removed
   2     2       8       76        2      active sync   /dev/sde12
   3     3       8       44        3      active sync   /dev/sdc12
   4     4       0        0        4      faulty removed
   5     5       0        0        5      faulty removed
   6     6       8       12        6      active sync   /dev/sda12
   7     7       8       60        7      spare   /dev/sdd12

/dev/sdf12:
      Number   Major   Minor   RaidDevice State
this     0       8       92        0      active sync   /dev/sdf12

   0     0       8       92        0      active sync   /dev/sdf12
   1     1       0        0        1      faulty removed
   2     2       8       76        2      active sync   /dev/sde12
   3     3       8       44        3      active sync   /dev/sdc12
   4     4       0        0        4      faulty removed
   5     5       0        0        5      faulty removed
   6     6       8       12        6      active sync   /dev/sda12
   7     7       8       60        7      spare   /dev/sdd12
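
The Events counter and Update Time in the dumps above are what show how far behind a member is (sdg12 is at 0.3406784, sda12 at 0.3406790). A minimal sketch for pulling just those fields out of `mdadm -E` output follows; the function name `summarize_examine` is made up for illustration, and it assumes the 0.90-format layout shown above, where each dump starts with a `/dev/...:` line.

```shell
# Read one or more "mdadm -E" dumps on stdin and print, per device:
#   <device> <events> <update time>
summarize_examine() {
    awk '
        /^\/dev\//             { dev = $1; sub(/:$/, "", dev) }
        /^[ \t]*Update Time :/ { sub(/^[^:]*: /, ""); when = $0 }
        /^[ \t]*Events :/      { print dev, $3, when }
    '
}
```

It could be fed with something like `for d in /dev/sd[acdefg]12; do mdadm -E "$d"; done | summarize_examine`, which prints one line per member and makes the stale one easy to spot.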

