On Tue Oct 08, 2013 at 09:19:44 +0200, Guillaume Betous wrote:

> Hi,
>
> My RAID 5 has failed... After a first failure, the spare disk has
> started its rebuild. During the rebuild process (60%), I received a
> 2nd email :(
>
> The RAID became laggy and finally unusable.
>
> Now I can't recover the RAID array. Even if there is no particular
> precious data, I'm trying to recover it, if only to learn a little bit
> :)
>
> I've tried the procedures written in the wiki, and before trying the
> last one (recreate), I write this mail, as said in the wiki :)
>
> Trying to --force fails with the following message :
> mdadm: /dev/sdf1 has no superblock - assembly aborted
>
> Removing sdf1 from RAID array results in same error on sdd1 and so on...
>
> You'll find some command results here after :
>
> mdadm --examine : http://pastie.org/8385891
> timeouts : http://pastie.org/8385901
> smartctl -x : http://pastebin.com/BXMHADZD
>

Looks like you've got some timeout mismatches, which are probably
causing some of the issues. Two of the drives are WD Reds, which have
SCT ERC enabled by default at 7 seconds (which is good). There's also a
Seagate which supports ERC but doesn't have it enabled (you'll need to
set that at each boot). Then there's a Seagate and a WD Green which
don't support ERC at all, so you'll need to set the timeouts to 180
seconds or more at each boot for those.

I've no idea which disk is which though - I'd guess the smartctl output
is in order, but there's nothing to actually say which output
corresponds with which device.

The first WD Red (sda?) is reporting a couple of read errors - those are
the only obvious SMART errors though. There are a couple of command
timeouts in the SMART attributes for the non-green Seagate (sdb?) as
well, which might also be relevant (I'm not familiar with that attribute
though, so I'm not entirely sure).

As far as the array goes, it looks like you _should_ be able to force
assembly with sdc1, sde1 & sdf1. They all have array positions listed,
whereas the other two are just listed as spares. If that fails, retry
with --verbose and post the resulting error messages (and the
corresponding section of dmesg output).

Make sure you set the ERC/timeouts before attempting to re-add either of
the other disks though.

HTH,
    Robin
-- 
     ___
    ( ' }     |       Robin Hill        <robin@xxxxxxxxxxxxxxx> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |
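
The steps in the reply above can be sketched as a small dry-run script.
This is only an illustration under stated assumptions: it prints the
commands rather than running them, the drive names (sdX, sdY, sdZ) are
placeholders because the thread does not say which disk is which, and
/dev/md0 is a hypothetical array name that does not appear in the
original mail. The 7-second ERC value mirrors the WD Red default
mentioned above, and 180 is the minimum timeout suggested for drives
without ERC.

```shell
#!/bin/sh
# Dry run: prints the commands instead of executing them.
# Review the output, then run the commands as root if they look right.

# Drives that support SCT ERC but have it disabled: set read/write
# recovery to 7.0 seconds (smartctl takes tenths of a second).
erc_cmd() {
    printf 'smartctl -l scterc,70,70 /dev/%s\n' "$1"
}

# Drives without ERC support: raise the kernel command timeout (seconds).
timeout_cmd() {
    printf 'echo 180 > /sys/block/%s/device/timeout\n' "$1"
}

# Forced, verbose assembly: first argument is the md device, the rest
# are member partitions given without the /dev/ prefix.
assemble_cmd() {
    md=$1
    shift
    printf 'mdadm --assemble --force --verbose %s' "$md"
    for part in "$@"; do
        printf ' /dev/%s' "$part"
    done
    printf '\n'
}

# Placeholders (assumptions): replace with the real device names.
ERC_DRIVES="${ERC_DRIVES:-sdX}"
NO_ERC_DRIVES="${NO_ERC_DRIVES:-sdY sdZ}"
MD_DEVICE="${MD_DEVICE:-/dev/md0}"

for d in $ERC_DRIVES; do erc_cmd "$d"; done
for d in $NO_ERC_DRIVES; do timeout_cmd "$d"; done

# Only the three members that still list an array position.
assemble_cmd "$MD_DEVICE" sdc1 sde1 sdf1
```

Neither setting survives a reboot, so the first two groups of commands
would need to go into a boot-time script once the device names are
confirmed.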