Re: RAID 5 re-add of removed drive? (failed drive replacement)

Kindly read the document carefully and thoroughly.

raidhotadd /dev/mdX /dev/sdb

It says:

Q. I have a two-disk mirrored array. Suppose one of the disks in the
mirrored RAID array fails; I then replace that disk with a new one
(I have hot-swappable SCSI drives). The question is how to rebuild
the RAID array after the disk fails.

A. A redundant array of inexpensive disks (also: redundant array of
independent disks) is a system which uses multiple hard drives to
share or replicate data among the drives. You can use both IDE and
SCSI disks for mirroring.

If you are not using hot-swappable drives, you need to shut down the
server first. Once the hard disk has been replaced, use raidhotadd to
add disks to RAID-1, -4 and -5 arrays while they are active.

Assuming the new SCSI disk is /dev/sdb, type the following command:

# raidhotadd /dev/mdX /dev/sdb
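
For reference: raidhotadd comes from the old raidtools package. On a
system managed with mdadm, as yours appears to be from your logs, the
equivalent hot-add is, assuming the array is /dev/md0 and the new
partition is /dev/sdb1:

# mdadm /dev/md0 --add /dev/sdb1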


On Tue, Jun 2, 2009 at 4:15 PM, Alexander Rietsch
<Alexander.Rietsch@xxxxxxxxxx> wrote:
> Thank you for answering my mail. But instead of actually reading it, you
> posted a link which contains no more information than what is already in
> the RAID FAQ or the mdadm man page. So here is the short version of my
> problem:
>
>>> disc 0: sdi1 <- is now disc 7 and SPARE
>>> disc 1: sdl1
>>> disc 2: sdh1
>>> disc 3: sdj1
>>> disc 4: sdk1
>>> disc 5: sdg1
>>> disc 6: sda1 <- is now faulty removed
>
> sdb1 <- replacement drive (rebuild not finished), now SPARE
>
> Of the original 7 drives, 2 are disabled. Please tell me:
> - how to re-add sdi1 as disc 0 (mdadm --re-add just adds it as a spare)
> - how to enable sda1 as disc 6 (mdadm --assemble --force --scan refuses to
> accept it)
> - how to use the new drive sdb1 as disc 7 (mdadm --assemble --force --scan
> just adds it as a spare)
>
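As far as I know, --re-add can only put a device back into its old slot
when the superblock event counts still match up (or a write-intent bitmap
lets md catch the device up); otherwise md falls back to treating it as a
fresh spare, which matches what you are seeing. Before changing anything,
it is worth comparing the event counters and recorded roles across all
members, for example:

# mdadm --examine /dev/sd[abghijkl]1 | egrep 'this|Events'
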
> original post:
>
> After removing a drive and rebuilding onto the new one, another disc in the
> array failed. I still have all the data redundantly available (the old
> drive is still there), but the RAID metadata is now in a state where it's
> impossible to access the data. Is it possible to rearrange the drives to
> force the kernel to assemble a valid array?
>
> Here is the story:
>
> // my normal boot log showing RAID devices
>
> Jun  1 22:37:45 localhost klogd: md: md0 stopped.
> Jun  1 22:37:45 localhost klogd: md: bind<sdl1>
> Jun  1 22:37:45 localhost klogd: md: bind<sdh1>
> Jun  1 22:37:45 localhost klogd: md: bind<sdj1>
> Jun  1 22:37:45 localhost klogd: md: bind<sdk1>
> Jun  1 22:37:45 localhost klogd: md: bind<sdg1>
> Jun  1 22:37:45 localhost klogd: md: bind<sda1>
> Jun  1 22:37:45 localhost klogd: md: bind<sdi1>
> Jun  1 22:37:45 localhost klogd: xor: automatically using best checksumming
> function: generic_sse
> Jun  1 22:37:45 localhost klogd:    generic_sse:  5144.000 MB/sec
> Jun  1 22:37:45 localhost klogd: xor: using function: generic_sse (5144.000
> MB/sec)
> Jun  1 22:37:45 localhost klogd: async_tx: api initialized (async)
> Jun  1 22:37:45 localhost klogd: raid6: int64x1   1539 MB/s
> Jun  1 22:37:45 localhost klogd: raid6: int64x2   1558 MB/s
> Jun  1 22:37:45 localhost klogd: raid6: int64x4   1968 MB/s
> Jun  1 22:37:45 localhost klogd: raid6: int64x8   1554 MB/s
> Jun  1 22:37:45 localhost klogd: raid6: sse2x1    2441 MB/s
> Jun  1 22:37:45 localhost klogd: raid6: sse2x2    3250 MB/s
> Jun  1 22:37:45 localhost klogd: raid6: sse2x4    3460 MB/s
> Jun  1 22:37:45 localhost klogd: raid6: using algorithm sse2x4 (3460 MB/s)
> Jun  1 22:37:45 localhost klogd: md: raid6 personality registered for level
> 6
> Jun  1 22:37:45 localhost klogd: md: raid5 personality registered for level
> 5
> Jun  1 22:37:45 localhost klogd: md: raid4 personality registered for level
> 4
> Jun  1 22:37:45 localhost klogd: raid5: device sdi1 operational as raid disk
> 0
> Jun  1 22:37:45 localhost klogd: raid5: device sda1 operational as raid disk
> 6
> Jun  1 22:37:45 localhost klogd: raid5: device sdg1 operational as raid disk
> 5
> Jun  1 22:37:45 localhost klogd: raid5: device sdk1 operational as raid disk
> 4
> Jun  1 22:37:45 localhost klogd: raid5: device sdj1 operational as raid disk
> 3
> Jun  1 22:37:45 localhost klogd: raid5: device sdh1 operational as raid disk
> 2
> Jun  1 22:37:45 localhost klogd: raid5: device sdl1 operational as raid disk
> 1
> Jun  1 22:37:45 localhost klogd: raid5: allocated 7434kB for md0
> Jun  1 22:37:45 localhost klogd: raid5: raid level 5 set md0 active with 7
> out of 7 devices, algorithm 2
> Jun  1 22:37:45 localhost klogd: RAID5 conf printout:
> Jun  1 22:37:45 localhost klogd:  --- rd:7 wd:7
> Jun  1 22:37:45 localhost klogd:  disk 0, o:1, dev:sdi1
> Jun  1 22:37:45 localhost klogd:  disk 1, o:1, dev:sdl1
> Jun  1 22:37:45 localhost klogd:  disk 2, o:1, dev:sdh1
> Jun  1 22:37:45 localhost klogd:  disk 3, o:1, dev:sdj1
> Jun  1 22:37:45 localhost klogd:  disk 4, o:1, dev:sdk1
> Jun  1 22:37:45 localhost klogd:  disk 5, o:1, dev:sdg1
> Jun  1 22:37:45 localhost klogd:  disk 6, o:1, dev:sda1
> Jun  1 22:37:45 localhost klogd: md0: detected capacity change from 0 to
> 6001213046784
> Jun  1 22:37:45 localhost klogd:  md0: unknown partition table
>
> // now a new spare drive is added
>
> [root@localhost ~]# mdadm /dev/md0 --add /dev/sdb1
>
> Jun  1 22:42:00 localhost klogd: md: bind<sdb1>
>
> // and here goes the drive replacement
>
> [root@localhost ~]# mdadm /dev/md0 --fail /dev/sdi1 --remove /dev/sdi1
>
> Jun  1 22:44:10 localhost klogd: raid5: Disk failure on sdi1, disabling
> device.
> Jun  1 22:44:10 localhost klogd: raid5: Operation continuing on 6 devices.
> Jun  1 22:44:10 localhost klogd: RAID5 conf printout:
> Jun  1 22:44:10 localhost klogd:  --- rd:7 wd:6
> Jun  1 22:44:10 localhost klogd:  disk 0, o:0, dev:sdi1
> Jun  1 22:44:10 localhost klogd:  disk 1, o:1, dev:sdl1
> Jun  1 22:44:10 localhost klogd:  disk 2, o:1, dev:sdh1
> Jun  1 22:44:10 localhost klogd:  disk 3, o:1, dev:sdj1
> Jun  1 22:44:10 localhost klogd:  disk 4, o:1, dev:sdk1
> Jun  1 22:44:10 localhost klogd:  disk 5, o:1, dev:sdg1
> Jun  1 22:44:10 localhost klogd:  disk 6, o:1, dev:sda1
> Jun  1 22:44:10 localhost klogd: RAID5 conf printout:
> Jun  1 22:44:10 localhost klogd:  --- rd:7 wd:6
> Jun  1 22:44:10 localhost klogd:  disk 1, o:1, dev:sdl1
> Jun  1 22:44:10 localhost klogd:  disk 2, o:1, dev:sdh1
> Jun  1 22:44:10 localhost klogd:  disk 3, o:1, dev:sdj1
> Jun  1 22:44:10 localhost klogd:  disk 4, o:1, dev:sdk1
> Jun  1 22:44:10 localhost klogd:  disk 5, o:1, dev:sdg1
> Jun  1 22:44:10 localhost klogd:  disk 6, o:1, dev:sda1
> Jun  1 22:44:10 localhost klogd: RAID5 conf printout:
> Jun  1 22:44:10 localhost klogd:  --- rd:7 wd:6
> Jun  1 22:44:10 localhost klogd:  disk 0, o:1, dev:sdb1
> Jun  1 22:44:10 localhost klogd:  disk 1, o:1, dev:sdl1
> Jun  1 22:44:10 localhost klogd:  disk 2, o:1, dev:sdh1
> Jun  1 22:44:10 localhost klogd:  disk 3, o:1, dev:sdj1
> Jun  1 22:44:10 localhost klogd:  disk 4, o:1, dev:sdk1
> Jun  1 22:44:10 localhost klogd:  disk 5, o:1, dev:sdg1
> Jun  1 22:44:10 localhost klogd:  disk 6, o:1, dev:sda1
> Jun  1 22:44:10 localhost klogd: md: recovery of RAID array md0
> Jun  1 22:44:10 localhost klogd: md: unbind<sdi1>
> Jun  1 22:44:10 localhost klogd: md: minimum _guaranteed_  speed: 1000
> KB/sec/disk.
> Jun  1 22:44:10 localhost klogd: md: using maximum available idle IO
> bandwidth (but not more than 200000 KB/sec) for recovery.
> Jun  1 22:44:10 localhost klogd: md: using 128k window, over a total of
> 976759936 blocks.
> Jun  1 22:44:10 localhost klogd: md: export_rdev(sdi1)
>
> [root@localhost ~]# more /proc/mdstat
> Personalities : [raid6] [raid5] [raid4]
> md0 : active raid5 sdb1[7] sda1[6] sdg1[5] sdk1[4] sdj1[3] sdh1[2] sdl1[1]
>      5860559616 blocks level 5, 64k chunk, algorithm 2 [7/6] [_UUUUUU]
>      [=====>...............]  recovery = 27.5% (269352320/976759936)
> finish=276.2min speed=42686K/sec
>
> // surface error on a RAID drive during recovery:
>
> Jun  2 03:58:59 localhost klogd: ata1.00: exception Emask 0x0 SAct 0xffff
> SErr 0x0 action 0x0
> Jun  2 03:59:49 localhost klogd: ata1.00: irq_stat 0x40000008
> Jun  2 03:59:49 localhost klogd: ata1.00: cmd
> 60/08:58:3f:bd:b8/00:00:6b:00:00/40 tag 11 ncq 4096 in
> Jun  2 03:59:49 localhost klogd:          res
> 41/40:08:3f:bd:b8/8c:00:6b:00:00/00 Emask 0x409 (media error) <F>
> Jun  2 03:59:49 localhost klogd: ata1.00: status: { DRDY ERR }
> Jun  2 03:59:49 localhost klogd: ata1.00: error: { UNC }
> Jun  2 03:59:49 localhost klogd: ata1.00: configured for UDMA/133
> Jun  2 03:59:49 localhost klogd: ata1: EH complete
> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] 2930277168 512-byte
> hardware sectors: (1.50 TB/1.36 TiB)
> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] Write Protect is off
> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] Write cache: enabled,
> read cache: enabled, doesn't support DPO or FUA
> Jun  2 03:59:49 localhost klogd: ata1.00: exception Emask 0x0 SAct 0x3ffc
> SErr 0x0 action 0x0
> Jun  2 03:59:49 localhost klogd: ata1.00: irq_stat 0x40000008
> Jun  2 03:59:49 localhost klogd: ata1.00: cmd
> 60/08:20:3f:bd:b8/00:00:6b:00:00/40 tag 4 ncq 4096 in
> Jun  2 03:59:49 localhost klogd:          res
> 41/40:08:3f:bd:b8/28:00:6b:00:00/00 Emask 0x409 (media error) <F>
> Jun  2 03:59:49 localhost klogd: ata1.00: status: { DRDY ERR }
> Jun  2 03:59:49 localhost klogd: ata1.00: error: { UNC }
> Jun  2 03:59:49 localhost klogd: ata1.00: configured for UDMA/133
> Jun  2 03:59:49 localhost klogd: ata1: EH complete
> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] 2930277168 512-byte
> hardware sectors: (1.50 TB/1.36 TiB)
> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] Write Protect is off
> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] Write cache: enabled,
> read cache: enabled, doesn't support DPO or FUA
> ...
> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
> (sector 1807269136 on sda1).
> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
> (sector 1807269144 on sda1).
> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
> (sector 1807269152 on sda1).
> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
> (sector 1807269160 on sda1).
> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
> (sector 1807269168 on sda1).
> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
> (sector 1807269176 on sda1).
> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
> (sector 1807269184 on sda1).
> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
> (sector 1807269192 on sda1).
> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
> (sector 1807269200 on sda1).
> Jun  2 03:59:49 localhost klogd: raid5:md0: read error not correctable
> (sector 1807269208 on sda1).
> Jun  2 03:59:49 localhost klogd: ata1: EH complete
> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] 2930277168 512-byte
> hardware sectors: (1.50 TB/1.36 TiB)
> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] Write Protect is off
> Jun  2 03:59:49 localhost klogd: sd 0:0:0:0: [sda] Write cache: enabled,
> read cache: enabled, doesn't support DPO or FUA
> Jun  2 03:59:49 localhost klogd: RAID5 conf printout:
> Jun  2 03:59:49 localhost klogd:  --- rd:7 wd:5
> Jun  2 03:59:49 localhost klogd:  disk 0, o:1, dev:sdb1
> Jun  2 03:59:49 localhost klogd:  disk 1, o:1, dev:sdl1
> Jun  2 03:59:49 localhost klogd:  disk 2, o:1, dev:sdh1
> Jun  2 03:59:49 localhost klogd:  disk 3, o:1, dev:sdj1
> Jun  2 03:59:49 localhost klogd:  disk 4, o:1, dev:sdk1
> Jun  2 03:59:49 localhost klogd:  disk 5, o:1, dev:sdg1
> Jun  2 03:59:49 localhost klogd:  disk 6, o:0, dev:sda1
> Jun  2 03:59:49 localhost klogd: RAID5 conf printout:
> Jun  2 03:59:49 localhost klogd:  --- rd:7 wd:5
> Jun  2 03:59:49 localhost klogd:  disk 1, o:1, dev:sdl1
> Jun  2 03:59:49 localhost klogd:  disk 2, o:1, dev:sdh1
> Jun  2 03:59:49 localhost klogd:  disk 3, o:1, dev:sdj1
> Jun  2 03:59:50 localhost klogd:  disk 4, o:1, dev:sdk1
> Jun  2 03:59:50 localhost klogd:  disk 5, o:1, dev:sdg1
> Jun  2 03:59:50 localhost klogd:  disk 6, o:0, dev:sda1
> Jun  2 03:59:50 localhost klogd: RAID5 conf printout:
> Jun  2 03:59:50 localhost klogd:  --- rd:7 wd:5
> Jun  2 03:59:50 localhost klogd:  disk 1, o:1, dev:sdl1
> Jun  2 03:59:50 localhost klogd:  disk 2, o:1, dev:sdh1
> Jun  2 03:59:50 localhost klogd:  disk 3, o:1, dev:sdj1
> Jun  2 03:59:50 localhost klogd:  disk 4, o:1, dev:sdk1
> Jun  2 03:59:50 localhost klogd:  disk 5, o:1, dev:sdg1
> Jun  2 03:59:50 localhost klogd:  disk 6, o:0, dev:sda1
> Jun  2 03:59:50 localhost klogd: RAID5 conf printout:
> Jun  2 03:59:50 localhost klogd:  --- rd:7 wd:5
> Jun  2 03:59:50 localhost klogd:  disk 1, o:1, dev:sdl1
> Jun  2 03:59:50 localhost klogd:  disk 2, o:1, dev:sdh1
> Jun  2 03:59:50 localhost klogd:  disk 3, o:1, dev:sdj1
> Jun  2 03:59:50 localhost klogd:  disk 4, o:1, dev:sdk1
> Jun  2 03:59:50 localhost klogd:  disk 5, o:1, dev:sdg1
> Jun  2 04:26:17 localhost smartd[2502]: Device: /dev/sda, 34 Currently
> unreadable (pending) sectors
> Jun  2 04:26:17 localhost smartd[2502]: Device: /dev/sda, 34 Offline
> uncorrectable sectors
>
> // md0 is now down. But hey, still got the old drive, so just add it again:
>
> [root@localhost ~]# mdadm /dev/md0 --add /dev/sdi1
>
> Jun  2 09:11:49 localhost klogd: md: bind<sdi1>
>
> // it's just added as a SPARE! HELP!!! reboot always helps..
>
> [root@localhost ~]# reboot
> [root@localhost log]# mdadm -E /dev/sd[bagkjhli]1
> /dev/sda1:
>          Magic : a92b4efc
>        Version : 0.90.00
>           UUID : 15401f4b:391c2538:89022bfa:d48f439f
>  Creation Time : Sun Nov  2 13:21:54 2008
>     Raid Level : raid5
>  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>     Array Size : 5860559616 (5589.07 GiB 6001.21 GB)
>   Raid Devices : 7
>  Total Devices : 7
> Preferred Minor : 0
>
>    Update Time : Mon Jun  1 22:44:10 2009
>          State : clean
>  Active Devices : 6
> Working Devices : 7
>  Failed Devices : 0
>  Spare Devices : 1
>       Checksum : 22d364f3 - correct
>         Events : 2599984
>
>         Layout : left-symmetric
>     Chunk Size : 64K
>
>      Number   Major   Minor   RaidDevice State
> this     6       8        1        6      active sync   /dev/sda1
>
>   0     0       0        0        0      removed
>   1     1       8      177        1      active sync   /dev/sdl1
>   2     2       8      113        2      active sync   /dev/sdh1
>   3     3       8      145        3      active sync   /dev/sdj1
>   4     4       8      161        4      active sync   /dev/sdk1
>   5     5       8       97        5      active sync   /dev/sdg1
>   6     6       8        1        6      active sync   /dev/sda1
>   7     7       8       17        7      spare   /dev/sdb1
> /dev/sdb1:
>          Magic : a92b4efc
>        Version : 0.90.00
>           UUID : 15401f4b:391c2538:89022bfa:d48f439f
>  Creation Time : Sun Nov  2 13:21:54 2008
>     Raid Level : raid5
>  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>     Array Size : 5860559616 (5589.07 GiB 6001.21 GB)
>   Raid Devices : 7
>  Total Devices : 8
> Preferred Minor : 0
>
>    Update Time : Tue Jun  2 09:11:49 2009
>          State : clean
>  Active Devices : 5
> Working Devices : 7
>  Failed Devices : 1
>  Spare Devices : 2
>       Checksum : 22d3f8dd - correct
>         Events : 2599992
>
>         Layout : left-symmetric
>     Chunk Size : 64K
>
>      Number   Major   Minor   RaidDevice State
> this     8       8       17        8      spare   /dev/sdb1
>
>   0     0       0        0        0      removed
>   1     1       8      177        1      active sync   /dev/sdl1
>   2     2       8      113        2      active sync   /dev/sdh1
>   3     3       8      145        3      active sync   /dev/sdj1
>   4     4       8      161        4      active sync   /dev/sdk1
>   5     5       8       97        5      active sync   /dev/sdg1
>   6     6       0        0        6      faulty removed
>   7     7       8      129        7      spare   /dev/sdi1
>   8     8       8       17        8      spare   /dev/sdb1
> /dev/sdg1:
>          Magic : a92b4efc
>        Version : 0.90.00
>           UUID : 15401f4b:391c2538:89022bfa:d48f439f
>  Creation Time : Sun Nov  2 13:21:54 2008
>     Raid Level : raid5
>  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>     Array Size : 5860559616 (5589.07 GiB 6001.21 GB)
>   Raid Devices : 7
>  Total Devices : 8
> Preferred Minor : 0
>
>    Update Time : Tue Jun  2 09:11:49 2009
>          State : clean
>  Active Devices : 5
> Working Devices : 7
>  Failed Devices : 1
>  Spare Devices : 2
>       Checksum : 22d3f92d - correct
>         Events : 2599992
>
>         Layout : left-symmetric
>     Chunk Size : 64K
>
>      Number   Major   Minor   RaidDevice State
> this     5       8       97        5      active sync   /dev/sdg1
>
>   0     0       0        0        0      removed
>   1     1       8      177        1      active sync   /dev/sdl1
>   2     2       8      113        2      active sync   /dev/sdh1
>   3     3       8      145        3      active sync   /dev/sdj1
>   4     4       8      161        4      active sync   /dev/sdk1
>   5     5       8       97        5      active sync   /dev/sdg1
>   6     6       0        0        6      faulty removed
>   7     7       8      129        7      spare   /dev/sdi1
>   8     8       8       17        8      spare   /dev/sdb1
> /dev/sdh1:
>          Magic : a92b4efc
>        Version : 0.90.00
>           UUID : 15401f4b:391c2538:89022bfa:d48f439f
>  Creation Time : Sun Nov  2 13:21:54 2008
>     Raid Level : raid5
>  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>     Array Size : 5860559616 (5589.07 GiB 6001.21 GB)
>   Raid Devices : 7
>  Total Devices : 8
> Preferred Minor : 0
>
>    Update Time : Tue Jun  2 09:11:49 2009
>          State : clean
>  Active Devices : 5
> Working Devices : 7
>  Failed Devices : 1
>  Spare Devices : 2
>       Checksum : 22d3f937 - correct
>         Events : 2599992
>
>         Layout : left-symmetric
>     Chunk Size : 64K
>
>      Number   Major   Minor   RaidDevice State
> this     2       8      113        2      active sync   /dev/sdh1
>
>   0     0       0        0        0      removed
>   1     1       8      177        1      active sync   /dev/sdl1
>   2     2       8      113        2      active sync   /dev/sdh1
>   3     3       8      145        3      active sync   /dev/sdj1
>   4     4       8      161        4      active sync   /dev/sdk1
>   5     5       8       97        5      active sync   /dev/sdg1
>   6     6       0        0        6      faulty removed
>   7     7       8      129        7      spare   /dev/sdi1
>   8     8       8       17        8      spare   /dev/sdb1
> /dev/sdi1:
>          Magic : a92b4efc
>        Version : 0.90.00
>           UUID : 15401f4b:391c2538:89022bfa:d48f439f
>  Creation Time : Sun Nov  2 13:21:54 2008
>     Raid Level : raid5
>  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>     Array Size : 5860559616 (5589.07 GiB 6001.21 GB)
>   Raid Devices : 7
>  Total Devices : 8
> Preferred Minor : 0
>
>    Update Time : Tue Jun  2 09:11:49 2009
>          State : clean
>  Active Devices : 5
> Working Devices : 7
>  Failed Devices : 1
>  Spare Devices : 2
>       Checksum : 22d3f94b - correct
>         Events : 2599992
>
>         Layout : left-symmetric
>     Chunk Size : 64K
>
>      Number   Major   Minor   RaidDevice State
> this     7       8      129        7      spare   /dev/sdi1
>
>   0     0       0        0        0      removed
>   1     1       8      177        1      active sync   /dev/sdl1
>   2     2       8      113        2      active sync   /dev/sdh1
>   3     3       8      145        3      active sync   /dev/sdj1
>   4     4       8      161        4      active sync   /dev/sdk1
>   5     5       8       97        5      active sync   /dev/sdg1
>   6     6       0        0        6      faulty removed
>   7     7       8      129        7      spare   /dev/sdi1
>   8     8       8       17        8      spare   /dev/sdb1
> /dev/sdj1:
>          Magic : a92b4efc
>        Version : 0.90.00
>           UUID : 15401f4b:391c2538:89022bfa:d48f439f
>  Creation Time : Sun Nov  2 13:21:54 2008
>     Raid Level : raid5
>  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>     Array Size : 5860559616 (5589.07 GiB 6001.21 GB)
>   Raid Devices : 7
>  Total Devices : 8
> Preferred Minor : 0
>
>    Update Time : Tue Jun  2 09:11:49 2009
>          State : clean
>  Active Devices : 5
> Working Devices : 7
>  Failed Devices : 1
>  Spare Devices : 2
>       Checksum : 22d3f959 - correct
>         Events : 2599992
>
>         Layout : left-symmetric
>     Chunk Size : 64K
>
>      Number   Major   Minor   RaidDevice State
> this     3       8      145        3      active sync   /dev/sdj1
>
>   0     0       0        0        0      removed
>   1     1       8      177        1      active sync   /dev/sdl1
>   2     2       8      113        2      active sync   /dev/sdh1
>   3     3       8      145        3      active sync   /dev/sdj1
>   4     4       8      161        4      active sync   /dev/sdk1
>   5     5       8       97        5      active sync   /dev/sdg1
>   6     6       0        0        6      faulty removed
>   7     7       8      129        7      spare   /dev/sdi1
>   8     8       8       17        8      spare   /dev/sdb1
> /dev/sdk1:
>          Magic : a92b4efc
>        Version : 0.90.00
>           UUID : 15401f4b:391c2538:89022bfa:d48f439f
>  Creation Time : Sun Nov  2 13:21:54 2008
>     Raid Level : raid5
>  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>     Array Size : 5860559616 (5589.07 GiB 6001.21 GB)
>   Raid Devices : 7
>  Total Devices : 8
> Preferred Minor : 0
>
>    Update Time : Tue Jun  2 09:11:49 2009
>          State : clean
>  Active Devices : 5
> Working Devices : 7
>  Failed Devices : 1
>  Spare Devices : 2
>       Checksum : 22d3f96b - correct
>         Events : 2599992
>
>         Layout : left-symmetric
>     Chunk Size : 64K
>
>      Number   Major   Minor   RaidDevice State
> this     4       8      161        4      active sync   /dev/sdk1
>
>   0     0       0        0        0      removed
>   1     1       8      177        1      active sync   /dev/sdl1
>   2     2       8      113        2      active sync   /dev/sdh1
>   3     3       8      145        3      active sync   /dev/sdj1
>   4     4       8      161        4      active sync   /dev/sdk1
>   5     5       8       97        5      active sync   /dev/sdg1
>   6     6       0        0        6      faulty removed
>   7     7       8      129        7      spare   /dev/sdi1
>   8     8       8       17        8      spare   /dev/sdb1
> /dev/sdl1:
>          Magic : a92b4efc
>        Version : 0.90.00
>           UUID : 15401f4b:391c2538:89022bfa:d48f439f
>  Creation Time : Sun Nov  2 13:21:54 2008
>     Raid Level : raid5
>  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>     Array Size : 5860559616 (5589.07 GiB 6001.21 GB)
>   Raid Devices : 7
>  Total Devices : 8
> Preferred Minor : 0
>
>    Update Time : Tue Jun  2 09:11:49 2009
>          State : clean
>  Active Devices : 5
> Working Devices : 7
>  Failed Devices : 1
>  Spare Devices : 2
>       Checksum : 22d3f975 - correct
>         Events : 2599992
>
>         Layout : left-symmetric
>     Chunk Size : 64K
>
>      Number   Major   Minor   RaidDevice State
> this     1       8      177        1      active sync   /dev/sdl1
>
>   0     0       0        0        0      removed
>   1     1       8      177        1      active sync   /dev/sdl1
>   2     2       8      113        2      active sync   /dev/sdh1
>   3     3       8      145        3      active sync   /dev/sdj1
>   4     4       8      161        4      active sync   /dev/sdk1
>   5     5       8       97        5      active sync   /dev/sdg1
>   6     6       0        0        6      faulty removed
>   7     7       8      129        7      spare   /dev/sdi1
>   8     8       8       17        8      spare   /dev/sdb1
>
> the old RAID configuration was:
>
> disc 0: sdi1 <- is now disc 7 and SPARE
> disc 1: sdl1
> disc 2: sdh1
> disc 3: sdj1
> disc 4: sdk1
> disc 5: sdg1
> disc 6: sda1 <- is now faulty removed
>
> [root@localhost log]# mdadm --assemble --force /dev/md0 /dev/sd[ilhjkgab]1
> mdadm: /dev/md/0 assembled from 5 drives and 2 spares - not enough to start
> the array.
> [root@localhost log]# cat /proc/mdstat
> Personalities :
> md0 : inactive sdl1[1](S) sdb1[8](S) sdi1[7](S) sda1[6](S) sdg1[5](S)
> sdk1[4](S) sdj1[3](S) sdh1[2](S)
>      8790840960 blocks
>
>
> On large arrays this may happen a lot: a bad drive is only discovered
> during maintenance operations, when it is already too late. Maybe a
> fail-safe way of adding a redundant drive would be a good option to add
> to md.
>
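One way to reduce that risk with the tools that exist today: trigger a
full read check of the array before pulling a member, so latent bad
sectors are found (and rewritten from parity) while the redundancy is
still intact:

# echo check > /sys/block/md0/md/sync_action
# cat /proc/mdstat

Using 'repair' instead of 'check' additionally makes md rewrite any
parity mismatches it finds.
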
> Please tell me if you see any solution to the problems below.
>
> 1. Is it possible to reassign /dev/sdi1 as disc 0 and access the RAID as it
> was before the restore attempt?
>
> 2. Is it possible to reassign /dev/sda1 as disc 6 and backup the still
> readable data on the RAID?
>
> 3. I guess more than 90% of the data was written to /dev/sdb1 in the
> restore attempt. Is it possible to use /dev/sdb1 as disc 7 to access the
> RAID?
>
> Thank you for looking at the problem
> Alexander
>
>
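
To your three numbered questions: since all of the old members are still
readable, the usual last resort discussed for cases like this is to stop
the array and recreate it over the same devices with --assume-clean. That
rewrites only the superblocks and leaves the data untouched, provided the
original geometry is reproduced exactly: device order, level, chunk size,
layout, and the 0.90 metadata format. This is a sketch only, for your
pre-replacement layout; verify every parameter against your --examine
output before running anything, and mount read-only until you trust the
result (assuming the filesystem sits directly on md0):

# mdadm --stop /dev/md0
# mdadm --create /dev/md0 --assume-clean --metadata=0.90 \
        --level=5 --raid-devices=7 --chunk=64 --layout=left-symmetric \
        /dev/sdi1 /dev/sdl1 /dev/sdh1 /dev/sdj1 /dev/sdk1 /dev/sdg1 /dev/sda1
# mount -o ro /dev/md0 /mnt

That addresses question 1, and with sda1 back in slot 6 it addresses
question 2 as well: copy off what you can before sda loses more sectors.
Two caveats: anything written to the array between failing sdi1 at 22:44
and the crash will be stale on sdi1, and for question 3 note that sdb1 was
being rebuilt into slot 0 (not 7, per the conf printout in your log), so
everything past the point where the rebuild stopped is garbage on sdb1;
sdi1 remains the safer copy of that slot.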



-- 
-- Sujit K M



On Tue, Jun 2, 2009 at 3:48 PM, Sujit Karataparambil <sjt.kar@xxxxxxxxx> wrote:
> http://www.cyberciti.biz/faq/howto-rebuilding-a-raid-array-after-a-disk-fails/
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
