On Thu, 21 Apr 2011 20:32:57 -0600 John Valarti <mdadmuser@xxxxxxxxx> wrote:

> On Thu, Apr 21, 2011 at 1:59 PM, David Brown <david.brown@xxxxxxxxxxxx> wrote:
> .
> > My first thought would be to get /all/ the disks, not just the "failed"
> > ones, out of the machine.  You want to make full images of them (with
> > ddrescue or something similar) to files on another disk, and then work
> > with those images.
> ..
> > Once you've got some (hopefully most) of your data recovered from the
> > images, buy four /new/ disks to put in the machine, and work on your
> > restore.  You don't want to reuse the failing disks, and probably the
> > other two equally old and worn disks will be high risk too.
>
> OK, I think I understand.
> Does that mean I need to buy 8 disks, all the same size or bigger?
> The originals are 250GB SATA so that should be OK, I guess.
>
> I read some more and found out I should run mdadm --examine.
>
> Should I not be able to just add the one disk partition sdc2 back to the RAID?

Possibly.

It looks like sdb2 failed in October 2009 !!!! and nobody noticed, so your
array has been running degraded since then.

If you run

   mdadm -A /dev/md1 --force /dev/sd[acd]2

then you will have your array back, though there could be a small amount of
data corruption if the array was in the middle of writing when the system
crashed/died/lost power/whatever happened.

This will give you access to your data.  How much you trust your drives to
continue giving access to your data is up to you, but you would be wise to
at least buy a 1TB drive to copy all the data onto before you put too much
stress on your old drives.

Once you have a safe copy, you could run

   mdadm /dev/md1 --add /dev/sdb2

This will add sdb2 to the array and recover the data for sdb2 from the data
and parity on the other drives.  If this works - great.  However there is a
reasonable chance you will hit a read error, in which case the recovery will
abort and you will still have your data on the degraded array.

You could possibly run a bad-blocks test on each drive (which will be
destructive - but you have a backup on the 1TB drive) and decide whether you
want to throw them out or keep using them.

Whatever you do, once you have a working array again that you feel happy to
trust, make sure a 'check' run happens regularly.  Some distros provide a
cron job to do this for you.  It simply involves

   echo check > /sys/block/md1/md/sync_action

This will read every block on every device to make sure there are no sleeping
bad blocks.  Every month is probably a reasonable frequency to run it.

Also run "mdadm --monitor" configured to send you email if there is a drive
failure, and run "mdadm --monitor --oneshot" from a cron job every day so
that if you have a degraded array it will nag you about it every day.
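As a rough sketch of how that might be wired up (untested; the cron.d file
names, the schedule, and the you@example.com address below are just
placeholders, and your distro's mdadm package may already ship an
equivalent job and monitor service):

   # /etc/cron.d/mdadm-check -- monthly scrub of md1
   30 2 1 * *  root  echo check > /sys/block/md1/md/sync_action

   # /etc/mdadm.conf -- tell "mdadm --monitor" where to send mail
   MAILADDR you@example.com

   # /etc/cron.d/mdadm-oneshot -- daily reminder while any array is degraded
   0 7 * * *  root  /sbin/mdadm --monitor --scan --oneshot

On Debian-style systems the config file may live at /etc/mdadm/mdadm.conf
instead of /etc/mdadm.conf.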
Good luck,
NeilBrown


>
> Here is the result of --examine
>
> /dev/sda2:
>           Magic : a92b4efc
>         Version : 0.90.00
>            UUID : ddf4d448:36afa319:f0917855:03f8bbe8
>   Creation Time : Mon May 15 16:38:05 2006
>      Raid Level : raid5
>   Used Dev Size : 244975104 (233.63 GiB 250.85 GB)
>      Array Size : 734925312 (700.88 GiB 752.56 GB)
>    Raid Devices : 4
>   Total Devices : 3
> Preferred Minor : 1
>
>     Update Time : Mon Apr 18 07:48:54 2011
>           State : clean
>  Active Devices : 3
> Working Devices : 3
>  Failed Devices : 1
>   Spare Devices : 0
>        Checksum : 5674ce60 - correct
>          Events : 28580020
>
>          Layout : left-symmetric
>      Chunk Size : 256K
>
>       Number   Major   Minor   RaidDevice State
> this     1       8       18        1      active sync   /dev/sdb2
>
>    0     0       8        2        0      active sync   /dev/sda2
>    1     1       8       18        1      active sync   /dev/sdb2
>    2     2       8       34        2      active sync   /dev/sdc2
>    3     3       0        0        3      faulty removed
>
> /dev/sdb2:
>           Magic : a92b4efc
>         Version : 0.90.00
>            UUID : ddf4d448:36afa319:f0917855:03f8bbe8
>   Creation Time : Mon May 15 16:38:05 2006
>      Raid Level : raid5
>   Used Dev Size : 244975104 (233.63 GiB 250.85 GB)
>      Array Size : 734925312 (700.88 GiB 752.56 GB)
>    Raid Devices : 4
>   Total Devices : 4
> Preferred Minor : 1
>
>     Update Time : Sun Oct 18 10:04:06 2009
>           State : active
>  Active Devices : 4
> Working Devices : 4
>  Failed Devices : 0
>   Spare Devices : 0
>        Checksum : 5171dcb2 - correct
>          Events : 20333614
>
>          Layout : left-symmetric
>      Chunk Size : 256K
>
>       Number   Major   Minor   RaidDevice State
> this     3       8       50        3      active sync   /dev/sdd2
>
>    0     0       8        2        0      active sync   /dev/sda2
>    1     1       8       18        1      active sync   /dev/sdb2
>    2     2       8       34        2      active sync   /dev/sdc2
>    3     3       8       50        3      active sync   /dev/sdd2
>
> /dev/sdc2:
>           Magic : a92b4efc
>         Version : 0.90.00
>            UUID : ddf4d448:36afa319:f0917855:03f8bbe8
>   Creation Time : Mon May 15 16:38:05 2006
>      Raid Level : raid5
>   Used Dev Size : 244975104 (233.63 GiB 250.85 GB)
>      Array Size : 734925312 (700.88 GiB 752.56 GB)
>    Raid Devices : 4
>   Total Devices : 3
> Preferred Minor : 1
>
>     Update Time : Mon Apr 18 07:48:51 2011
>           State : clean
>  Active Devices : 3
> Working Devices : 3
>  Failed Devices : 1
>   Spare Devices : 0
>        Checksum : 5674ce6b - correct
>          Events : 28580018
>
>          Layout : left-symmetric
>      Chunk Size : 256K
>
>       Number   Major   Minor   RaidDevice State
> this     2       8       34        2      active sync   /dev/sdc2
>
>    0     0       8        2        0      active sync   /dev/sda2
>    1     1       8       18        1      active sync   /dev/sdb2
>    2     2       8       34        2      active sync   /dev/sdc2
>    3     3       0        0        3      faulty removed
>
> /dev/sdd2:
>           Magic : a92b4efc
>         Version : 0.90.00
>            UUID : ddf4d448:36afa319:f0917855:03f8bbe8
>   Creation Time : Mon May 15 16:38:05 2006
>      Raid Level : raid5
>   Used Dev Size : 244975104 (233.63 GiB 250.85 GB)
>      Array Size : 734925312 (700.88 GiB 752.56 GB)
>    Raid Devices : 4
>   Total Devices : 3
> Preferred Minor : 1
>
>     Update Time : Mon Apr 18 07:48:54 2011
>           State : clean
>  Active Devices : 3
> Working Devices : 3
>  Failed Devices : 1
>   Spare Devices : 0
>        Checksum : 5674ce4e - correct
>          Events : 28580020
>
>          Layout : left-symmetric
>      Chunk Size : 256K
>
>       Number   Major   Minor   RaidDevice State
> this     0       8        2        0      active sync   /dev/sda2
>
>    0     0       8        2        0      active sync   /dev/sda2
>    1     1       8       18        1      active sync   /dev/sdb2
>    2     2       8       34        2      active sync   /dev/sdc2
>    3     3       0        0        3      faulty removed

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html