Re: RAID 5 re-add of removed drive? (failed drive replacement)

http://www.tldp.org/HOWTO/Software-RAID-HOWTO-3.html

This is the RAID documentation, which I found rather insufficient.

On Tue, Jun 2, 2009 at 4:22 PM, Sujit Karataparambil <sjt.kar@xxxxxxxxx> wrote:
> Kindly read the document carefully and thoroughly.
>
> raidhotadd /dev/mdX /dev/sdb
>
> It says
>
> Q. I have a two-disk mirrored array. Suppose one of the disks in the
> mirrored RAID array fails; I will then replace that disk with a new one
> (I have hot-swappable SCSI drives). The question is how I rebuild the
> RAID array after the disk fails.
>
> A. A redundant array of inexpensive disks (also called a redundant array
> of independent disks) is a system that uses multiple hard drives to
> share or replicate data among the drives. You can use both IDE and
> SCSI disks for mirroring.
>
> If you are not using hot-swappable drives, then you need to shut down the
> server first. Once the hard disk has been replaced, you can use
> raidhotadd to add disks to RAID-1, -4 and -5 arrays while they
> are active.
>
> Assuming that the new SCSI disk is /dev/sdb, type the following command:
> # raidhotadd /dev/mdX /dev/sdb
>
>
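Note: raidhotadd comes from the old raidtools package and is long obsolete;
on current systems the same hot replacement is done with mdadm. A rough
sketch only, assuming the array is /dev/md0, the failed member is /dev/sdc1
and the new disk is /dev/sdb1 (the device names are placeholders):

// add the new disk as a spare, then fail and remove the old member
[root@localhost ~]# mdadm /dev/md0 --add /dev/sdb1
[root@localhost ~]# mdadm /dev/md0 --fail /dev/sdc1 --remove /dev/sdc1
// the rebuild onto the spare starts automatically; watch it with
[root@localhost ~]# cat /proc/mdstat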
> On Tue, Jun 2, 2009 at 4:15 PM, Alexander Rietsch
> <Alexander.Rietsch@xxxxxxxxxx> wrote:
>> Thank you for answering my mail. But please actually read it instead of
>> posting a link that contains no more information than what is already in
>> the RAID FAQ or the mdadm man page. Here is the short version of my problem:
>>
>>>> disc 0: sdi1 <- is now disc 7 and SPARE
>>>> disc 1: sdl1
>>>> disc 2: sdh1
>>>> disc 3: sdj1
>>>> disc 4: sdk1
>>>> disc 5: sdg1
>>>> disc 6: sda1 <- is now faulty removed
>>
>> sdb1 <- replacement drive (rebuild not finished), now SPARE
>>
>> Of the original 7 drives, 2 are disabled. Please tell me how to
>> - re-add sdi1 as disc 0 (mdadm --re-add just adds it as a spare)
>> - enable sda1 as disc 6 (mdadm --assemble --force --scan refuses to
>> accept it)
>> - use the new drive sdb1 as disc 7 (mdadm --assemble --force --scan
>> just adds it as a spare)
>>
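For the situation described above (0.90 superblocks, where a removed member
comes back only as a spare), the usual last resort discussed on this list is
to recreate the array in place with --assume-clean, using exactly the same
level, chunk size, layout, metadata version and device order, so that only
the superblocks are rewritten and the data area is untouched. This is only a
sketch, assuming the geometry shown in the mdadm -E output further down
(RAID5, 7 devices, 64K chunks, left-symmetric, 0.90 metadata) and the
original order sdi1 sdl1 sdh1 sdj1 sdk1 sdg1 sda1; verify every parameter
against your own -E output first, and work on copies of the disks if at all
possible:

[root@localhost ~]# mdadm --stop /dev/md0
[root@localhost ~]# mdadm --create /dev/md0 --assume-clean --metadata=0.90 \
    --level=5 --raid-devices=7 --chunk=64 --layout=left-symmetric \
    /dev/sdi1 /dev/sdl1 /dev/sdh1 /dev/sdj1 /dev/sdk1 /dev/sdg1 /dev/sda1
// sanity-check before trusting or writing anything
[root@localhost ~]# mdadm --detail /dev/md0
[root@localhost ~]# fsck -n /dev/md0

If the read-only check looks sane, mount read-only and copy the data off
before attempting any further rebuild.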
>> original post:
>>
>> After removing a drive and rebuilding onto the new one, another disc in the
>> array failed. Now I still have all the data redundantly available (the old
>> drive is still there), but the RAID header is now in a state where it is
>> impossible to access the data. Is it possible to rearrange the drives to
>> force the kernel to assemble a valid array?
>>
>> Here is the story:
>>
>> // my normal boot log showing RAID devices
>>
>> Jun  1 22:37:45 localhost klogd: md: md0 stopped.
>> Jun  1 22:37:45 localhost klogd: md: bind<sdl1>
>> Jun  1 22:37:45 localhost klogd: md: bind<sdh1>
>> Jun  1 22:37:45 localhost klogd: md: bind<sdj1>
>> Jun  1 22:37:45 localhost klogd: md: bind<sdk1>
>> Jun  1 22:37:45 localhost klogd: md: bind<sdg1>
>> Jun  1 22:37:45 localhost klogd: md: bind<sda1>
>> Jun  1 22:37:45 localhost klogd: md: bind<sdi1>
>> Jun  1 22:37:45 localhost klogd: xor: automatically using best checksumming
>> function: generic_sse
>> Jun  1 22:37:45 localhost klogd:    generic_sse:  5144.000 MB/sec
>> Jun  1 22:37:45 localhost klogd: xor: using function: generic_sse (5144.000
>> MB/sec)
>> Jun  1 22:37:45 localhost klogd: async_tx: api initialized (async)
>> Jun  1 22:37:45 localhost klogd: raid6: int64x1   1539 MB/s
>> Jun  1 22:37:45 localhost klogd: raid6: int64x2   1558 MB/s
>> Jun  1 22:37:45 localhost klogd: raid6: int64x4   1968 MB/s
>> Jun  1 22:37:45 localhost klogd: raid6: int64x8   1554 MB/s
>> Jun  1 22:37:45 localhost klogd: raid6: sse2x1    2441 MB/s
>> Jun  1 22:37:45 localhost klogd: raid6: sse2x2    3250 MB/s
>> Jun  1 22:37:45 localhost klogd: raid6: sse2x4    3460 MB/s
>> Jun  1 22:37:45 localhost klogd: raid6: using algorithm sse2x4 (3460 MB/s)
>> Jun  1 22:37:45 localhost klogd: md: raid6 personality registered for level
>> 6
>> Jun  1 22:37:45 localhost klogd: md: raid5 personality registered for level
>> 5
>> Jun  1 22:37:45 localhost klogd: md: raid4 personality registered for level
>> 4
>> Jun  1 22:37:45 localhost klogd: raid5: device sdi1 operational as raid disk
>> 0
>> Jun  1 22:37:45 localhost klogd: raid5: device sda1 operational as raid disk
>> 6
>> Jun  1 22:37:45 localhost klogd: raid5: device sdg1 operational as raid disk
>> 5
>> Jun  1 22:37:45 localhost klogd: raid5: device sdk1 operational as raid disk
>> 4
>> Jun  1 22:37:45 localhost klogd: raid5: device sdj1 operational as raid disk
>> 3
>> Jun  1 22:37:45 localhost klogd: raid5: device sdh1 operational as raid disk
>> 2
>> Jun  1 22:37:45 localhost klogd: raid5: device sdl1 operational as raid disk
>> 1
>> Jun  1 22:37:45 localhost klogd: raid5: allocated 7434kB for md0
>> Jun  1 22:37:45 localhost klogd: raid5: raid level 5 set md0 active with 7
>> out of 7 devices, algorithm 2
>> Jun  1 22:37:45 localhost klogd: RAID5 conf printout:
>> Jun  1 22:37:45 localhost klogd:  --- rd:7 wd:7
>> Jun  1 22:37:45 localhost klogd:  disk 0, o:1, dev:sdi1
>> Jun  1 22:37:45 localhost klogd:  disk 1, o:1, dev:sdl1
>> Jun  1 22:37:45 localhost klogd:  disk 2, o:1, dev:sdh1
>> Jun  1 22:37:45 localhost klogd:  disk 3, o:1, dev:sdj1
>> Jun  1 22:37:45 localhost klogd:  disk 4, o:1, dev:sdk1
>> Jun  1 22:37:45 localhost klogd:  disk 5, o:1, dev:sdg1
>> Jun  1 22:37:45 localhost klogd:  disk 6, o:1, dev:sda1
>> Jun  1 22:37:45 localhost klogd: md0: detected capacity change from 0 to
>> 6001213046784
>> Jun  1 22:37:45 localhost klogd:  md0: unknown partition table
>>
>> // now a new spare drive is added
>>
>> [root@localhost ~]# mdadm /dev/md0 --add /dev/sdb1
>>
>> Jun  1 22:42:00 localhost klogd: md: bind<sdb1>
>>
>> // and here goes the drive replacement
>>
>> [root@localhost ~]# mdadm /dev/md0 --fail /dev/sdi1 --remove /dev/sdi1
>>
>> Jun  1 22:44:10 localhost klogd: raid5: Disk failure on sdi1, disabling
>> device.
>> Jun  1 22:44:10 localhost klogd: raid5: Operation continuing on 6 devices.
>> Jun  1 22:44:10 localhost klogd: RAID5 conf printout:
>> Jun  1 22:44:10 localhost klogd:  --- rd:7 wd:6
>> Jun  1 22:44:10 localhost klogd:  disk 0, o:0, dev:sdi1
>> Jun  1 22:44:10 localhost klogd:  disk 1, o:1, dev:sdl1
>> Jun  1 22:44:10 localhost klogd:  disk 2, o:1, dev:sdh1
>> Jun  1 22:44:10 localhost klogd:  disk 3, o:1, dev:sdj1
>> Jun  1 22:44:10 localhost klogd:  disk 4, o:1, dev:sdk1
>> Jun  1 22:44:10 localhost klogd:  disk 5, o:1, dev:sdg1
>> Jun  1 22:44:10 localhost klogd:  disk 6, o:1, dev:sda1
>> Jun  1 22:44:10 localhost klogd: RAID5 conf printout:
>> Jun  1 22:44:10 localhost klogd:  --- rd:7 wd:6
>> Jun  1 22:44:10 localhost klogd:  disk 1, o:1, dev:sdl1
>> Jun  1 22:44:10 localhost klogd:  disk 2, o:1, dev:sdh1
>> Jun  1 22:44:10 localhost klogd:  disk 3, o:1, dev:sdj1
>> Jun  1 22:44:10 localhost klogd:  disk 4, o:1, dev:sdk1
>> Jun  1 22:44:10 localhost klogd:  disk 5, o:1, dev:sdg1
>> Jun  1 22:44:10 localhost klogd:  disk 6, o:1, dev:sda1
>> Jun  1 22:44:10 localhost klogd: RAID5 conf printout:
>> Jun  1 22:44:10 localhost klogd:  --- rd:7 wd:6
>> Jun  1 22:44:10 localhost klogd:  disk 0, o:1, dev:sdb1
>> Jun  1 22:44:10 localhost klogd:  disk 1, o:1, dev:sdl1
>> Jun  1 22:44:10 localhost klogd:  disk 2, o:1, dev:sdh1
>> Jun  1 22:44:10 localhost klogd:  disk 3, o:1, dev:sdj1
>> Jun  1 22:44:10 localhost klogd:  disk 4, o:1, dev:sdk1
>> Jun  1 22:44:10 localhost klogd:  disk 5, o:1, dev:sdg1
>> Jun  1 22:44:10 localhost klogd:  disk 6, o:1, dev:sda1
>> Jun  1 22:44:10 localhost klogd: md: recovery of RAID array md0
>> Jun  1 22:44:10 localhost klogd: md: unbind<sdi1>
>> Jun  1 22:44:10 localhost klogd: md: minimum _guaranteed_  speed: 1000
>> KB/sec/disk.
>> Jun  1 22:44:10 localhost klogd: md: using maximum available idle IO
>> bandwidth (but not more than 200000 KB/sec) for recovery.
>> Jun  1 22:44:10 localhost klogd: md: using 128k window, over a total of
>> 976759936 blocks.
>> Jun  1 22:44:10 localhost klogd: md: export_rdev(sdi1)
>>
>> [root@localhost ~]# more /proc/mdstat
>> Personalities : [raid6] [raid5] [raid4]
>> md0 : active raid5 sdb1[7] sda1[6] sdg1[5] sdk1[4] sdj1[3] sdh1[2] sdl1[1]
>>      5860559616 blocks level 5, 64k chunk, algorithm 2 [7/6] [_UUUUUU]
>>      [=====>...............]  recovery = 27.5% (269352320/976759936)
>> finish=276.2min speed=42686K/sec
>>
>> // surface error on RAID drive while recovery:
>>
>> Jun  2 03:58:59 localhost klogd: ata1.00: exception Emask 0x0 SAct 0xffff
>> SErr 0x0 action 0x0
>> Jun  2 03:59:49 localhost klogd: ata1.00: irq_stat 0x40000008
>> Jun  2 03:59:49 localhost klogd: ata1.00: cmd
>> 60/08:58:3f:bd:b8/00:00:6b:00:00/40 tag 11 ncq 4096 in
>> Jun  2 03:59:49 localhost klogd:          res
>> 41/40:08:3f:bd:b8/8c:00:6b:00:00/00 Emask 0x409 (media error) <F>
>> Jun  2 03:59:49 localhost klogd: ata1.00: status: { DRDY ERR }
>> Jun  2 03:59:49 localhost klogd: ata1.00: error: { UNC }
>> Jun  2 03:59:49 localhost klogd: ata1.00: configured for UDMA/133
>> Jun  2 03:59:49 localhost klogd: ata1: EH complete
>> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] 2930277168 512-byte
>> hardware sectors: (1.50 TB/1.36 TiB)
>> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] Write Protect is off
>> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] Write cache: enabled,
>> read cache: enabled, doesn't support DPO or FUA
>> Jun  2 03:59:49 localhost klogd: ata1.00: exception Emask 0x0 SAct 0x3ffc
>> SErr 0x0 action 0x0
>> Jun  2 03:59:49 localhost klogd: ata1.00: irq_stat 0x40000008
>> Jun  2 03:59:49 localhost klogd: ata1.00: cmd
>> 60/08:20:3f:bd:b8/00:00:6b:00:00/40 tag 4 ncq 4096 in
>> Jun  2 03:59:49 localhost klogd:          res
>> 41/40:08:3f:bd:b8/28:00:6b:00:00/00 Emask 0x409 (media error) <F>
>> Jun  2 03:59:49 localhost klogd: ata1.00: status: { DRDY ERR }
>> Jun  2 03:59:49 localhost klogd: ata1.00: error: { UNC }
>> Jun  2 03:59:49 localhost klogd: ata1.00: configured for UDMA/133
>> Jun  2 03:59:49 localhost klogd: ata1: EH complete
>> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] 2930277168 512-byte
>> hardware sectors: (1.50 TB/1.36 TiB)
>> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] Write Protect is off
>> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] Write cache: enabled,
>> read cache: enabled, doesn't support DPO or FUA
>> ...
>> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
>> (sector 1807269136 on sda1).
>> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
>> (sector 1807269144 on sda1).
>> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
>> (sector 1807269152 on sda1).
>> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
>> (sector 1807269160 on sda1).
>> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
>> (sector 1807269168 on sda1).
>> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
>> (sector 1807269176 on sda1).
>> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
>> (sector 1807269184 on sda1).
>> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
>> (sector 1807269192 on sda1).
>> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
>> (sector 1807269200 on sda1).
>> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
>> (sector 1807269208 on sda1).
>> Jun  2 03:59:49 localhost klogd: ata1: EH complete
>> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] 2930277168 512-byte
>> hardware sectors: (1.50 TB/1.36 TiB)
>> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] Write Protect is off
>> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] Write cache: enabled,
>> read cache: enabled, doesn't support DPO or FUA
>> Jun  2 03:59:49 localhost klogd: RAID5 conf printout:
>> Jun  2 03:59:49 localhost klogd:  --- rd:7 wd:5
>> Jun  2 03:59:49 localhost klogd:  disk 0, o:1, dev:sdb1
>> Jun  2 03:59:49 localhost klogd:  disk 1, o:1, dev:sdl1
>> Jun  2 03:59:49 localhost klogd:  disk 2, o:1, dev:sdh1
>> Jun  2 03:59:49 localhost klogd:  disk 3, o:1, dev:sdj1
>> Jun  2 03:59:49 localhost klogd:  disk 4, o:1, dev:sdk1
>> Jun  2 03:59:49 localhost klogd:  disk 5, o:1, dev:sdg1
>> Jun  2 03:59:49 localhost klogd:  disk 6, o:0, dev:sda1
>> Jun  2 03:59:49 localhost klogd: RAID5 conf printout:
>> Jun  2 03:59:49 localhost klogd:  --- rd:7 wd:5
>> Jun  2 03:59:49 localhost klogd:  disk 1, o:1, dev:sdl1
>> Jun  2 03:59:49 localhost klogd:  disk 2, o:1, dev:sdh1
>> Jun  2 03:59:49 localhost klogd:  disk 3, o:1, dev:sdj1
>> Jun  2 03:59:50 localhost klogd:  disk 4, o:1, dev:sdk1
>> Jun  2 03:59:50 localhost klogd:  disk 5, o:1, dev:sdg1
>> Jun  2 03:59:50 localhost klogd:  disk 6, o:0, dev:sda1
>> Jun  2 03:59:50 localhost klogd: RAID5 conf printout:
>> Jun  2 03:59:50 localhost klogd:  --- rd:7 wd:5
>> Jun  2 03:59:50 localhost klogd:  disk 1, o:1, dev:sdl1
>> Jun  2 03:59:50 localhost klogd:  disk 2, o:1, dev:sdh1
>> Jun  2 03:59:50 localhost klogd:  disk 3, o:1, dev:sdj1
>> Jun  2 03:59:50 localhost klogd:  disk 4, o:1, dev:sdk1
>> Jun  2 03:59:50 localhost klogd:  disk 5, o:1, dev:sdg1
>> Jun  2 03:59:50 localhost klogd:  disk 6, o:0, dev:sda1
>> Jun  2 03:59:50 localhost klogd: RAID5 conf printout:
>> Jun  2 03:59:50 localhost klogd:  --- rd:7 wd:5
>> Jun  2 03:59:50 localhost klogd:  disk 1, o:1, dev:sdl1
>> Jun  2 03:59:50 localhost klogd:  disk 2, o:1, dev:sdh1
>> Jun  2 03:59:50 localhost klogd:  disk 3, o:1, dev:sdj1
>> Jun  2 03:59:50 localhost klogd:  disk 4, o:1, dev:sdk1
>> Jun  2 03:59:50 localhost klogd:  disk 5, o:1, dev:sdg1
>> Jun  2 04:26:17 localhost smartd[2502]: Device: /dev/sda, 34 Currently
>> unreadable (pending) sectors
>> Jun  2 04:26:17 localhost smartd[2502]: Device: /dev/sda, 34 Offline
>> uncorrectable sectors
>>
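At this point sda is reporting pending and uncorrectable sectors; before any
further recovery attempt it is usually worth imaging the failing disk with
GNU ddrescue, so that later experiments run against a copy. A sketch, with
the target paths as placeholders:

[root@localhost ~]# smartctl -a /dev/sda
[root@localhost ~]# ddrescue /dev/sda /mnt/backup/sda.img /mnt/backup/sda.map
// the map file lets ddrescue resume and retry the bad regions later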
>> // md0 is now down. But hey, still got the old drive, so just add it again:
>>
>> [root@localhost ~]# mdadm /dev/md0 --add /dev/sdi1
>>
>> Jun  2 09:11:49 localhost klogd: md: bind<sdi1>
>>
>> // it's just added as a SPARE! HELP!!! reboot always helps..
>>
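A re-added disk only goes straight back into its old slot if md can tell it
is still in sync, which in practice means the array has a write-intent
bitmap or the event counters still match; otherwise --add (and --re-add)
just turn it into a spare, as seen here. For future arrays a bitmap makes a
failed-and-returned member resync in seconds instead of hours; a sketch, to
be set up while the array is healthy:

[root@localhost ~]# mdadm --grow --bitmap=internal /dev/md0
// later, after a member dropped out and came back:
[root@localhost ~]# mdadm /dev/md0 --re-add /dev/sdi1

It does not help after the fact, though: the bitmap has to exist before the
member is removed.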
>> [root@localhost ~]# reboot
>> [root@localhost log]# mdadm -E /dev/sd[bagkjhli]1
>> /dev/sda1:
>>          Magic : a92b4efc
>>        Version : 0.90.00
>>           UUID : 15401f4b:391c2538:89022bfa:d48f439f
>>  Creation Time : Sun Nov  2 13:21:54 2008
>>     Raid Level : raid5
>>  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>>     Array Size : 5860559616 (5589.07 GiB 6001.21 GB)
>>   Raid Devices : 7
>>  Total Devices : 7
>> Preferred Minor : 0
>>
>>    Update Time : Mon Jun  1 22:44:10 2009
>>          State : clean
>>  Active Devices : 6
>> Working Devices : 7
>>  Failed Devices : 0
>>  Spare Devices : 1
>>       Checksum : 22d364f3 - correct
>>         Events : 2599984
>>
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>>
>>      Number   Major   Minor   RaidDevice State
>> this     6       8        1        6      active sync   /dev/sda1
>>
>>   0     0       0        0        0      removed
>>   1     1       8      177        1      active sync   /dev/sdl1
>>   2     2       8      113        2      active sync   /dev/sdh1
>>   3     3       8      145        3      active sync   /dev/sdj1
>>   4     4       8      161        4      active sync   /dev/sdk1
>>   5     5       8       97        5      active sync   /dev/sdg1
>>   6     6       8        1        6      active sync   /dev/sda1
>>   7     7       8       17        7      spare   /dev/sdb1
>> /dev/sdb1:
>>          Magic : a92b4efc
>>        Version : 0.90.00
>>           UUID : 15401f4b:391c2538:89022bfa:d48f439f
>>  Creation Time : Sun Nov  2 13:21:54 2008
>>     Raid Level : raid5
>>  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>>     Array Size : 5860559616 (5589.07 GiB 6001.21 GB)
>>   Raid Devices : 7
>>  Total Devices : 8
>> Preferred Minor : 0
>>
>>    Update Time : Tue Jun  2 09:11:49 2009
>>          State : clean
>>  Active Devices : 5
>> Working Devices : 7
>>  Failed Devices : 1
>>  Spare Devices : 2
>>       Checksum : 22d3f8dd - correct
>>         Events : 2599992
>>
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>>
>>      Number   Major   Minor   RaidDevice State
>> this     8       8       17        8      spare   /dev/sdb1
>>
>>   0     0       0        0        0      removed
>>   1     1       8      177        1      active sync   /dev/sdl1
>>   2     2       8      113        2      active sync   /dev/sdh1
>>   3     3       8      145        3      active sync   /dev/sdj1
>>   4     4       8      161        4      active sync   /dev/sdk1
>>   5     5       8       97        5      active sync   /dev/sdg1
>>   6     6       0        0        6      faulty removed
>>   7     7       8      129        7      spare   /dev/sdi1
>>   8     8       8       17        8      spare   /dev/sdb1
>> /dev/sdg1:
>>          Magic : a92b4efc
>>        Version : 0.90.00
>>           UUID : 15401f4b:391c2538:89022bfa:d48f439f
>>  Creation Time : Sun Nov  2 13:21:54 2008
>>     Raid Level : raid5
>>  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>>     Array Size : 5860559616 (5589.07 GiB 6001.21 GB)
>>   Raid Devices : 7
>>  Total Devices : 8
>> Preferred Minor : 0
>>
>>    Update Time : Tue Jun  2 09:11:49 2009
>>          State : clean
>>  Active Devices : 5
>> Working Devices : 7
>>  Failed Devices : 1
>>  Spare Devices : 2
>>       Checksum : 22d3f92d - correct
>>         Events : 2599992
>>
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>>
>>      Number   Major   Minor   RaidDevice State
>> this     5       8       97        5      active sync   /dev/sdg1
>>
>>   0     0       0        0        0      removed
>>   1     1       8      177        1      active sync   /dev/sdl1
>>   2     2       8      113        2      active sync   /dev/sdh1
>>   3     3       8      145        3      active sync   /dev/sdj1
>>   4     4       8      161        4      active sync   /dev/sdk1
>>   5     5       8       97        5      active sync   /dev/sdg1
>>   6     6       0        0        6      faulty removed
>>   7     7       8      129        7      spare   /dev/sdi1
>>   8     8       8       17        8      spare   /dev/sdb1
>> /dev/sdh1:
>>          Magic : a92b4efc
>>        Version : 0.90.00
>>           UUID : 15401f4b:391c2538:89022bfa:d48f439f
>>  Creation Time : Sun Nov  2 13:21:54 2008
>>     Raid Level : raid5
>>  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>>     Array Size : 5860559616 (5589.07 GiB 6001.21 GB)
>>   Raid Devices : 7
>>  Total Devices : 8
>> Preferred Minor : 0
>>
>>    Update Time : Tue Jun  2 09:11:49 2009
>>          State : clean
>>  Active Devices : 5
>> Working Devices : 7
>>  Failed Devices : 1
>>  Spare Devices : 2
>>       Checksum : 22d3f937 - correct
>>         Events : 2599992
>>
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>>
>>      Number   Major   Minor   RaidDevice State
>> this     2       8      113        2      active sync   /dev/sdh1
>>
>>   0     0       0        0        0      removed
>>   1     1       8      177        1      active sync   /dev/sdl1
>>   2     2       8      113        2      active sync   /dev/sdh1
>>   3     3       8      145        3      active sync   /dev/sdj1
>>   4     4       8      161        4      active sync   /dev/sdk1
>>   5     5       8       97        5      active sync   /dev/sdg1
>>   6     6       0        0        6      faulty removed
>>   7     7       8      129        7      spare   /dev/sdi1
>>   8     8       8       17        8      spare   /dev/sdb1
>> /dev/sdi1:
>>          Magic : a92b4efc
>>        Version : 0.90.00
>>           UUID : 15401f4b:391c2538:89022bfa:d48f439f
>>  Creation Time : Sun Nov  2 13:21:54 2008
>>     Raid Level : raid5
>>  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>>     Array Size : 5860559616 (5589.07 GiB 6001.21 GB)
>>   Raid Devices : 7
>>  Total Devices : 8
>> Preferred Minor : 0
>>
>>    Update Time : Tue Jun  2 09:11:49 2009
>>          State : clean
>>  Active Devices : 5
>> Working Devices : 7
>>  Failed Devices : 1
>>  Spare Devices : 2
>>       Checksum : 22d3f94b - correct
>>         Events : 2599992
>>
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>>
>>      Number   Major   Minor   RaidDevice State
>> this     7       8      129        7      spare   /dev/sdi1
>>
>>   0     0       0        0        0      removed
>>   1     1       8      177        1      active sync   /dev/sdl1
>>   2     2       8      113        2      active sync   /dev/sdh1
>>   3     3       8      145        3      active sync   /dev/sdj1
>>   4     4       8      161        4      active sync   /dev/sdk1
>>   5     5       8       97        5      active sync   /dev/sdg1
>>   6     6       0        0        6      faulty removed
>>   7     7       8      129        7      spare   /dev/sdi1
>>   8     8       8       17        8      spare   /dev/sdb1
>> /dev/sdj1:
>>          Magic : a92b4efc
>>        Version : 0.90.00
>>           UUID : 15401f4b:391c2538:89022bfa:d48f439f
>>  Creation Time : Sun Nov  2 13:21:54 2008
>>     Raid Level : raid5
>>  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>>     Array Size : 5860559616 (5589.07 GiB 6001.21 GB)
>>   Raid Devices : 7
>>  Total Devices : 8
>> Preferred Minor : 0
>>
>>    Update Time : Tue Jun  2 09:11:49 2009
>>          State : clean
>>  Active Devices : 5
>> Working Devices : 7
>>  Failed Devices : 1
>>  Spare Devices : 2
>>       Checksum : 22d3f959 - correct
>>         Events : 2599992
>>
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>>
>>      Number   Major   Minor   RaidDevice State
>> this     3       8      145        3      active sync   /dev/sdj1
>>
>>   0     0       0        0        0      removed
>>   1     1       8      177        1      active sync   /dev/sdl1
>>   2     2       8      113        2      active sync   /dev/sdh1
>>   3     3       8      145        3      active sync   /dev/sdj1
>>   4     4       8      161        4      active sync   /dev/sdk1
>>   5     5       8       97        5      active sync   /dev/sdg1
>>   6     6       0        0        6      faulty removed
>>   7     7       8      129        7      spare   /dev/sdi1
>>   8     8       8       17        8      spare   /dev/sdb1
>> /dev/sdk1:
>>          Magic : a92b4efc
>>        Version : 0.90.00
>>           UUID : 15401f4b:391c2538:89022bfa:d48f439f
>>  Creation Time : Sun Nov  2 13:21:54 2008
>>     Raid Level : raid5
>>  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>>     Array Size : 5860559616 (5589.07 GiB 6001.21 GB)
>>   Raid Devices : 7
>>  Total Devices : 8
>> Preferred Minor : 0
>>
>>    Update Time : Tue Jun  2 09:11:49 2009
>>          State : clean
>>  Active Devices : 5
>> Working Devices : 7
>>  Failed Devices : 1
>>  Spare Devices : 2
>>       Checksum : 22d3f96b - correct
>>         Events : 2599992
>>
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>>
>>      Number   Major   Minor   RaidDevice State
>> this     4       8      161        4      active sync   /dev/sdk1
>>
>>   0     0       0        0        0      removed
>>   1     1       8      177        1      active sync   /dev/sdl1
>>   2     2       8      113        2      active sync   /dev/sdh1
>>   3     3       8      145        3      active sync   /dev/sdj1
>>   4     4       8      161        4      active sync   /dev/sdk1
>>   5     5       8       97        5      active sync   /dev/sdg1
>>   6     6       0        0        6      faulty removed
>>   7     7       8      129        7      spare   /dev/sdi1
>>   8     8       8       17        8      spare   /dev/sdb1
>> /dev/sdl1:
>>          Magic : a92b4efc
>>        Version : 0.90.00
>>           UUID : 15401f4b:391c2538:89022bfa:d48f439f
>>  Creation Time : Sun Nov  2 13:21:54 2008
>>     Raid Level : raid5
>>  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>>     Array Size : 5860559616 (5589.07 GiB 6001.21 GB)
>>   Raid Devices : 7
>>  Total Devices : 8
>> Preferred Minor : 0
>>
>>    Update Time : Tue Jun  2 09:11:49 2009
>>          State : clean
>>  Active Devices : 5
>> Working Devices : 7
>>  Failed Devices : 1
>>  Spare Devices : 2
>>       Checksum : 22d3f975 - correct
>>         Events : 2599992
>>
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>>
>>      Number   Major   Minor   RaidDevice State
>> this     1       8      177        1      active sync   /dev/sdl1
>>
>>   0     0       0        0        0      removed
>>   1     1       8      177        1      active sync   /dev/sdl1
>>   2     2       8      113        2      active sync   /dev/sdh1
>>   3     3       8      145        3      active sync   /dev/sdj1
>>   4     4       8      161        4      active sync   /dev/sdk1
>>   5     5       8       97        5      active sync   /dev/sdg1
>>   6     6       0        0        6      faulty removed
>>   7     7       8      129        7      spare   /dev/sdi1
>>   8     8       8       17        8      spare   /dev/sdb1
>>
>> the old RAID configuration was:
>>
>> disc 0: sdi1 <- is now disc 7 and SPARE
>> disc 1: sdl1
>> disc 2: sdh1
>> disc 3: sdj1
>> disc 4: sdk1
>> disc 5: sdg1
>> disc 6: sda1 <- is now faulty removed
>>
>> [root@localhost log]# mdadm --assemble --force /dev/md0 /dev/sd[ilhjkgab]1
>> mdadm: /dev/md/0 assembled from 5 drives and 2 spares - not enough to start
>> the array.
>> [root@localhost log]# cat /proc/mdstat
>> Personalities :
>> md0 : inactive sdl1[1](S) sdb1[8](S) sdi1[7](S) sda1[6](S) sdg1[5](S)
>> sdk1[4](S) sdj1[3](S) sdh1[2](S)
>>      8790840960 blocks
>>
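When /proc/mdstat shows the array inactive with every member flagged (S)
like this, md is still holding the devices; they have to be released with
--stop before any further assemble or create attempt. Comparing the Events
counters across the members then shows which superblocks are the most
recent. For example:

[root@localhost ~]# mdadm --stop /dev/md0
[root@localhost ~]# mdadm --examine /dev/sd[abghijkl]1 | grep -E '^/dev/|Events'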
>>
>> On large arrays this may happen a lot: a bad drive is first discovered
>> during maintenance operations, when it is already too late. Maybe an option
>> to add a redundant drive in a fail-safe way would be a good addition to the
>> md services.
>>
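Latent bad sectors like the ones that broke this rebuild can be caught
earlier by scrubbing the array periodically; md exposes this through sysfs.
A sketch, typically run from cron while the array is healthy:

[root@localhost ~]# echo check > /sys/block/md0/md/sync_action
// progress shows up in /proc/mdstat; mismatches are counted afterwards
[root@localhost ~]# cat /proc/mdstat
[root@localhost ~]# cat /sys/block/md0/md/mismatch_cnt

A "check" pass reads every sector of every member, so unreadable blocks are
found (and rewritten from parity while the array is still redundant) before
a real rebuild has to depend on them.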
>> Please tell me if you see any solution to the problems below.
>>
>> 1. Is it possible to reassign /dev/sdi1 as disc 0 and access the RAID as it
>> was before the restore attempt?
>>
>> 2. Is it possible to reassign /dev/sda1 as disc 6 and back up the still
>> readable data on the RAID?
>>
>> 3. I guess more than 90% of the data was written to /dev/sdb1 in the restore
>> attempt. Is it possible to use /dev/sdb1 as disc 7 to access the RAID?
>>
>> Thank you for looking at the problem
>> Alexander
>>
>>
>
>
>
> --
> -- Sujit K M
>
>
>
> On Tue, Jun 2, 2009 at 3:48 PM, Sujit Karataparambil <sjt.kar@xxxxxxxxx> wrote:
>> http://www.cyberciti.biz/faq/howto-rebuilding-a-raid-array-after-a-disk-fails/
>>
>>
>> On Tue, Jun 2, 2009 at 3:39 PM, Alex R <Alexander.Rietsch@xxxxxxxxxx> wrote:
>>>
>>> I have a serious RAID problem here. Please have a look at this. Any help
>>> would be greatly appreciated!
>>>
>>> As always, most problems occur only during critical tasks like
>>> enlarging/restoring. I tried to replace a drive in my 7disc 6T RAID5 array
>>> as explained here:
>>> http://michael-prokop.at/blog/2006/09/09/raid5-online-resizing-with-linux/
>>>
>>> [... rest of the original post, quoted in full above ...]
>>> --
>>> View this message in context: http://www.nabble.com/RAID-5-re-add-of-removed-drive--%28failed-drive-replacement%29-tp23828899p23828899.html
>>> Sent from the linux-raid mailing list archive at Nabble.com.
>>>
>>>
>>
>>
>>
>> --
>> -- Sujit K M
>>
>
>
>
> --
> -- Sujit K M
>



-- 
-- Sujit K M
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
