On Sun, 8 Jul 2012 21:05:02 +0200 Dietrich Heise <dh@xxxxxxx> wrote:

> Hi,
>
> I have the following problem:
> one of four drives had S.M.A.R.T. errors, so I removed it and
> replaced it with a new one.
>
> While that drive was rebuilding, one of the three remaining devices
> had an I/O error (sdd1) (sdc1 was the replacement drive and was still
> syncing).
>
> Now the following happens (two drives show up as spares :( )

It looks like you tried to --add /dev/sdd1 back in after it failed, and
mdadm let you.  Newer versions of mdadm will refuse, as that is not a
good thing to do, but it shouldn't stop you getting your data back.

First thing to realise is that you could have data corruption.  There
is at least one block in the array which cannot be recovered, possibly
more: any block on sdd1 which is bad, and any block at the same offset
on sdc1.  These blocks may not be in files, which would be lucky, or
they may contain important metadata, which might mean you've lost lots
of files.

If you hadn't tried to --add /dev/sdd1 you could just force-assemble
the array back into degraded mode (without sdc1) and back up any
critical data.  As sdd1 now thinks it is a spare, you need to
re-create the array instead:

  mdadm -S /dev/md1
  mdadm -C /dev/md1 -l5 -n4 -e 1.2 -c 512 /dev/sdf1 /dev/sde1 /dev/sdd1 missing

or

  mdadm -C /dev/md1 -l5 -n4 -e 1.2 -c 512 /dev/sdf1 /dev/sde1 missing /dev/sdd1

depending on whether sdd1 was the 3rd or 4th device in the array - I
cannot tell from the output here.  You should then be able to mount
the array and back up your data (see the command sketches below).

You then want to use 'ddrescue' to copy sdd1 onto a device with no bad
blocks, and assemble the array using that device instead of sdd1.

Finally, you can add the new spare (sdc1) to the array and it should
rebuild successfully - providing there are no bad blocks on sdf1 or
sde1.
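To tell which of the two orderings is right, check the re-created array
read-only before writing anything to it.  A sketch, assuming the
filesystem on md1 is ext4 (swap in whatever fsck/mount your filesystem
needs); it also confirms the new superblocks kept the same 2048-sector
data offset as the old ones, since a different mdadm version can pick a
different default, which would shift all the data:

  mdadm -E /dev/sdd1 | grep 'Data Offset'  # should still say 2048 sectors
  fsck.ext4 -n /dev/md1          # read-only check; heavy damage suggests wrong order
  mount -o ro /dev/md1 /mnt      # mount read-only and inspect some known files
  umount /mnt
  # if it looked wrong, stop the array and try the other ordering:
  mdadm -S /dev/md1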
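For the ddrescue step, something like the following should do - here
/dev/sdg1 is just a stand-in for a partition on a known-good disk at
least as big as sdd1, and the map file should live on yet another disk
(re-running the same command retries only the blocks the map file
records as bad):

  ddrescue -f /dev/sdd1 /dev/sdg1 /root/sdd1.map  # -f is needed to write to a block device
  mdadm -S /dev/md1
  mdadm -A --run /dev/md1 /dev/sdf1 /dev/sde1 /dev/sdg1  # --run starts it despite being degraded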
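Re-adding the spare and watching the rebuild is then just:

  mdadm /dev/md1 --add /dev/sdc1
  watch cat /proc/mdstat   # the recovery line shows rebuild progress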
I hope that makes sense.  Do ask if anything is unclear.

NeilBrown

> p3 disks # mdadm -D /dev/md1
> /dev/md1:
>         Version : 1.2
>   Creation Time : Mon Feb 28 19:57:56 2011
>      Raid Level : raid5
>   Used Dev Size : 1465126400 (1397.25 GiB 1500.29 GB)
>    Raid Devices : 4
>   Total Devices : 4
>     Persistence : Superblock is persistent
>
>     Update Time : Sun Jul 8 20:37:12 2012
>           State : active, FAILED, Not Started
>  Active Devices : 2
> Working Devices : 4
>  Failed Devices : 0
>   Spare Devices : 2
>
>          Layout : left-symmetric
>      Chunk Size : 512K
>
>            Name : p3:0 (local to host p3)
>            UUID : 6d4ebfd4:491bcb50:d98d5e67:f226f362
>          Events : 121205
>
>     Number   Major   Minor   RaidDevice State
>        0       8       81        0      active sync   /dev/sdf1
>        1       8       65        1      active sync   /dev/sde1
>        2       0        0        2      removed
>        3       0        0        3      removed
>
>        4       8       49        -      spare   /dev/sdd1
>        5       8       33        -      spare   /dev/sdc1
>
> here is more information:
>
> p3 disks # mdadm -E /dev/sdc1
> /dev/sdc1:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x0
>      Array UUID : 6d4ebfd4:491bcb50:d98d5e67:f226f362
>            Name : p3:0 (local to host p3)
>   Creation Time : Mon Feb 28 19:57:56 2011
>      Raid Level : raid5
>    Raid Devices : 4
>
>  Avail Dev Size : 2930275057 (1397.26 GiB 1500.30 GB)
>      Array Size : 8790758400 (4191.76 GiB 4500.87 GB)
>   Used Dev Size : 2930252800 (1397.25 GiB 1500.29 GB)
>     Data Offset : 2048 sectors
>    Super Offset : 8 sectors
>           State : active
>     Device UUID : caefb029:526187ef:2051f578:db2b82b7
>
>     Update Time : Sun Jul 8 20:37:12 2012
>        Checksum : 18e2bfe1 - correct
>          Events : 121205
>
>          Layout : left-symmetric
>      Chunk Size : 512K
>
>     Device Role : spare
>     Array State : AA.. ('A' == active, '.' == missing)
> p3 disks # mdadm -E /dev/sdd1
> /dev/sdd1:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x0
>      Array UUID : 6d4ebfd4:491bcb50:d98d5e67:f226f362
>            Name : p3:0 (local to host p3)
>   Creation Time : Mon Feb 28 19:57:56 2011
>      Raid Level : raid5
>    Raid Devices : 4
>
>  Avail Dev Size : 2930269954 (1397.26 GiB 1500.30 GB)
>      Array Size : 8790758400 (4191.76 GiB 4500.87 GB)
>   Used Dev Size : 2930252800 (1397.25 GiB 1500.29 GB)
>     Data Offset : 2048 sectors
>    Super Offset : 8 sectors
>           State : active
>     Device UUID : 4231e244:60e27ed4:eff405d0:2e615493
>
>     Update Time : Sun Jul 8 20:37:12 2012
>        Checksum : 4bec6e25 - correct
>          Events : 0
>
>          Layout : left-symmetric
>      Chunk Size : 512K
>
>     Device Role : spare
>     Array State : AA.. ('A' == active, '.' == missing)
> p3 disks # mdadm -E /dev/sde1
> /dev/sde1:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x0
>      Array UUID : 6d4ebfd4:491bcb50:d98d5e67:f226f362
>            Name : p3:0 (local to host p3)
>   Creation Time : Mon Feb 28 19:57:56 2011
>      Raid Level : raid5
>    Raid Devices : 4
>
>  Avail Dev Size : 2930253889 (1397.25 GiB 1500.29 GB)
>      Array Size : 8790758400 (4191.76 GiB 4500.87 GB)
>   Used Dev Size : 2930252800 (1397.25 GiB 1500.29 GB)
>     Data Offset : 2048 sectors
>    Super Offset : 8 sectors
>           State : active
>     Device UUID : 28b08f44:4cc24663:84d39337:94c35d67
>
>     Update Time : Sun Jul 8 20:37:12 2012
>        Checksum : 15faa8a1 - correct
>          Events : 121205
>
>          Layout : left-symmetric
>      Chunk Size : 512K
>
>     Device Role : Active device 1
>     Array State : AA.. ('A' == active, '.' == missing)
> p3 disks # mdadm -E /dev/sdf1
> /dev/sdf1:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x0
>      Array UUID : 6d4ebfd4:491bcb50:d98d5e67:f226f362
>            Name : p3:0 (local to host p3)
>   Creation Time : Mon Feb 28 19:57:56 2011
>      Raid Level : raid5
>    Raid Devices : 4
>
>  Avail Dev Size : 2930269954 (1397.26 GiB 1500.30 GB)
>      Array Size : 8790758400 (4191.76 GiB 4500.87 GB)
>   Used Dev Size : 2930252800 (1397.25 GiB 1500.29 GB)
>     Data Offset : 2048 sectors
>    Super Offset : 8 sectors
>           State : active
>     Device UUID : 78d5600a:91927758:f78a1cea:3bfa3f5b
>
>     Update Time : Sun Jul 8 20:37:12 2012
>        Checksum : 7767cb10 - correct
>          Events : 121205
>
>          Layout : left-symmetric
>      Chunk Size : 512K
>
>     Device Role : Active device 0
>     Array State : AA.. ('A' == active, '.' == missing)
>
> Is there a way to repair the raid?
>
> thanks!
> Dietrich
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html