On Tue, Jan 6, 2015 at 1:34 PM, David Raffelt <david.raffelt@xxxxxxxxxxxxx> wrote:
> Hi Brian and Stefan,
> Thanks for your reply. I checked the status of the array after the rebuild
> (and before the reset).
>
> md0 : active raid6 sdd1[8] sdc1[4] sda1[3] sdb1[7] sdi1[5] sde1[1]
>       14650667520 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/6]
> [UUUUUU_]
>
> However given that I've never had any problems before with mdadm rebuilds I
> did not think to check the data before rebooting. Note that the array is
> still in this state. Before the reboot I tried to run a smartctl check on
> the failed drives and it could not read them. When I rebooted I did not
> actually replace any drives, I just power cycled to see if I could re-access
> the drives that were thrown out of the array. According to smartctl they are
> completely fine.
>
> I guess there is no way I can re-add the old drives and remove the newly
> synced drive? Even though I immediately kicked all users off the system
> when I got the mdadm alert, it's possible a small amount of data was written
> to the array during the resync.

Well it sounds like there's more than one possibility here. If I
follow correctly, you definitely had a working degraded 5/7 drive
array, correct? In which case at least it should be possible to get
that back, but I don't know what was happening at the time the system
hung up on poweroff.

It's not rare for smart to not test for certain failure vectors so it
might say the drive is fine when it isn't. But what you should do next
is

mdadm -Evv /dev/sd[abcdefg]1   ##use actual drive letters

Are you able to get information on all seven drives? Or do you
definitely have at least one drive failed?
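If eyeballing seven blocks of examine output is tedious, something
like the following could pull out just the event counters. This is only
a rough sketch: list_events is a made-up helper name, and it assumes
each member's section starts with a "/dev/...:" line and contains an
"Events : N" line, which is what 1.x superblocks normally print. Check
it against your real output before trusting it.

```shell
#!/bin/sh
# Hypothetical helper: read `mdadm -E` output on stdin and print
# "<device> <events>" for each member found.
# Assumption: each section begins with a line like "/dev/sdd1:" and
# has a line of the form "Events : 12345".
list_events() {
    awk '
        # Remember the device name from the section header line.
        /^\/dev\// { dev = $1; sub(/:$/, "", dev) }
        # Print the counter when the Events line for that device appears.
        $1 == "Events" && $2 == ":" { print dev, $3 }
    '
}
```

You would run it as "mdadm -E /dev/sd[abcdefg]1 | list_events" (again
with the actual drive letters). It only reads the superblocks, and
drives that print the same number are the ones that belong together.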
If the event counter from the above examine is the same for at least 5
drives, you should be able to assemble the array with this command:

mdadm --assemble --verbose /dev/mdX /dev/sd[bcdef]1

You have to feed the drive letter designation with the right letters
for drives with the same event counter. If that's 5 drives, use that.
If it's 6 drives, use that. If the event counters are all off, then
it's a matter of what they are, so you may just post the event counters
so we can see them.

This isn't going to write anything to the array, the fs isn't mounted.
So if it fails, nothing is worse off. If it works, then you can run
xfs_repair -n and see if you get a sane result. If that works you can
mount it in this degraded state and maybe extract some of the more
important data before proceeding to the next step.

In the meantime I'm also curious about:

smartctl -l scterc /dev/sdX

This has to be issued per drive, no shortcut available by specifying
all letters at once in brackets. And then lastly this one:

cat /sys/block/sd[abcdefg]/device/timeout

Again plug in the correct letters.

> Unfortunately this 15TB RAID was part of a 45TB GlusterFS distributed
> volume. It was only ever meant to be a scratch drive for intermediate
> scientific results, however inevitably most users used it to store lots of
> data. Oh well.

Right, well, it's not for sure toast yet. Also, one of the things
gluster is intended to mitigate is the loss of an entire brick, which
is what happened, but you need another 15TB of space to do
distributed-replicated on your scratch space. If you can tolerate
upwards of 48 hour single disk rebuild times, there are now 8TB HGST
Helium drives :-P

-- 
Chris Murphy

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs