On 20/10/2009 4:41 PM, Scott Marlowe wrote:

>> I have a 4 disk Raid10 array running on linux MD raid.
>> Sda / sdb / sdc / sdd
>>
>> One fine day, 2 of the drives just suddenly decide to die on me. (sda and
>> sdd)
>>
>> I've tried multiple methods to try to determine if I can get them back
>> online

You made an exact image of each drive onto new, spare drives with `dd' or a similar disk imaging tool before trying ANYTHING, right?

Otherwise, you may well have made things worse, particularly since you've tried to resync the array. Even if the data was recoverable before, it might not be now.

How, exactly, have the drives failed? Are they totally dead, so that the BIOS / disk controller don't even see them? Can the partition tables be read? Does 'file -s /dev/sda' report any output? What's the output of:

  smartctl -d ata -a /dev/sda

(repeat for sdd) ?

If the problem is just a few bad sectors, you can usually just force-re-add the drives into the array and then copy the array contents to another drive either at a low level (with dd_rescue) or at a file system level.

If the problem is one or more totally fried drives, where the drive is totally inaccessible or most of the data is hopelessly corrupt / unreadable, then you're in a lot more trouble. RAID 10 effectively stripes the data across the mirrored pairs, so if you lose a whole mirrored pair you've lost half the stripes. It's not that different from running paper through a shredder, discarding half the shreds, and lining it all back up.

On a side note: I'm personally increasingly annoyed with the tendency of RAID controllers (and s/w raid implementations) to treat disks with unrepairable bad sectors as dead and fail them out of the array. That's OK if you have a hot spare and no other drive fails during rebuild, but it's just not good enough if failing that drive would result in the array going into failed state.
Rather than failing a drive and as a result rendering the whole array unreadable in such situations, it should mark the drive defective, set the array to read-only, and start screaming for help. Way too much data gets murdered by RAID implementations removing mildly faulty drives from already-degraded arrays instead of just going read-only.

--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
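
To make the policy from the side note concrete, here is a minimal sketch in Python. It is purely illustrative and is not how Linux MD or any RAID controller actually behaves. The names `DriveState`, `Action` and `decide_action` are hypothetical, and so is the pairing of sda/sdb and sdc/sdd used in the example; which drives really mirror each other depends on the array layout.

```python
from enum import Enum


class DriveState(Enum):
    HEALTHY = "healthy"
    BAD_SECTORS = "bad sectors"  # mostly readable, some unrepairable sectors
    DEAD = "dead"                # not readable at all


class Action(Enum):
    NONE = "nothing to do"
    FAIL_DRIVE = "fail the drive out of the array"
    READ_ONLY_AND_ALERT = "mark drive defective, set array read-only, alert"
    ARRAY_FAILED = "array already failed"


def decide_action(pairs, states, suspect):
    """Pick an action for `suspect` in a RAID 10 array built from mirrored pairs.

    Hypothetical policy sketch only; real implementations differ.
    """
    # If every member of some mirrored pair is dead, half the stripes are gone.
    for pair in pairs:
        if all(states[d] is DriveState.DEAD for d in pair):
            return Action.ARRAY_FAILED

    if states[suspect] is DriveState.HEALTHY:
        return Action.NONE

    # A dead drive holds nothing readable, so failing it out loses nothing.
    if states[suspect] is DriveState.DEAD:
        return Action.FAIL_DRIVE

    # The suspect has bad sectors: look at its mirror partner(s).
    pair = next(p for p in pairs if suspect in p)
    partners = [d for d in pair if d != suspect]

    # Failing the suspect is only safe while a healthy mirror holds its data.
    if any(states[d] is DriveState.HEALTHY for d in partners):
        return Action.FAIL_DRIVE

    # Otherwise the suspect is the last mostly readable copy: keep it online.
    return Action.READ_ONLY_AND_ALERT


if __name__ == "__main__":
    # Assumed pairing, for illustration only.
    pairs = [("sda", "sdb"), ("sdc", "sdd")]
    states = {
        "sda": DriveState.DEAD,
        "sdb": DriveState.BAD_SECTORS,
        "sdc": DriveState.HEALTHY,
        "sdd": DriveState.HEALTHY,
    }
    print(decide_action(pairs, states, "sdb").value)
```

The point of the sketch is the last branch: a drive with bad sectors is only failed out while a healthy mirror still covers its data. Once it is the last copy, the array goes read-only instead of being declared failed.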