Hi Wol,

Thanks for your reply. I run BackupPC from a separate host so I have good
backups, and I'm not sweating too much!

I think you're right about drive 2 being bumped a while ago - that would
make sense with the event counts. My bad for having no error reporting
enabled to alert me. Very disappointed though; these are Samsung drives
and only 2 years old.

Given I have backups I went for the --force option, and am happy to report
it all went smoothly. I am not seeing any evidence of a rebuild, which is
a surprise:

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sda1[0] sdd1[3] sdc1[1]
      3906764800 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 0/15 pages [0KB], 65536KB chunk

unused devices: <none>

The raw device was encrypted - no problem with luksOpen. Now running
xfs_repair on one of the logical volumes. Looks like I have some data
loss, but it is minor. Fortunately the server has been sitting idle for a
couple of weeks due to vacation.

What do you think about there being no rebuild?
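
To double-check that the array really is consistent, I'm planning to kick
off a scrub once xfs_repair finishes. As I understand it (assuming the
usual md sysfs interface) that is just:

# echo check > /sys/block/md0/md/sync_action
# cat /proc/mdstat                        (to watch progress)
# cat /sys/block/md0/md/mismatch_cnt      (once it completes)

Does that sound sensible?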
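
For the archives, the forced assembly was essentially my original command
with --force added - roughly this, after stopping the inactive array:

# mdadm --stop /dev/md0
# mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdc1 /dev/sdd1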
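
The event counts I quoted came from examining each member, something along
these lines (exact invocation from memory):

# mdadm --examine /dev/sda1 /dev/sdc1 /dev/sdd1 | egrep 'Events|Update Time'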
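
The recovery on top of the array was along these lines - the mapping, VG
and LV names below are placeholders rather than my real ones:

# cryptsetup luksOpen /dev/md0 md0_crypt
# vgchange -ay vg0
# xfs_repair /dev/vg0/lv_data             (filesystem unmounted)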
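
Lesson learned on the error reporting front. As far as I can tell, getting
mdadm to mail me is just a MAILADDR line in /etc/mdadm.conf plus the
monitor running (the address below is a placeholder; many distros ship the
monitor as an mdmonitor service):

MAILADDR me@example.com

# mdadm --monitor --scan --daemonise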
Cheers,
Jonathan

On Thu, 2018-08-02 at 19:54 +0100, Wols Lists wrote:
> On 02/08/18 10:13, Jonathan Milton wrote:
> > Hi,
> >
> > Overnight my server had problems with its RAID5 (xfs corrupt inodes), on
> > reboot the raid comes up inactive.
> >
> > * Smarttools suggest disks (3x2TB) are healthy. I have powered down and
> > checked all the SATA leads are still plugged correctly.
> >
> > * MDADM is unable to assembled to raid from 1 drive:
> > # mdadm --assemble /dev/md0 /dev/sda1 /dev/sdc1 /dev/sdd1
> > mdadm: /dev/md0 assembled from 1 drive - not enough to start the array.
> >
> > * Event counts are well off on one drive ( 290391/182871/290391)
>
> Not good.
>
> > * SCT Error Recovery Control was disabled on all drives prior to this
> > failure but I have since modified the boot scripts to set to 7s as per
> > the wiki (no improvement)
> >
> > I am considering whether to try --force and would like advice from
> > experts first
>
> NOT WITHOUT A BACKUP!
>
> > Thanks in advance
>
> That "only one drive" bothers me. Have you got any spare drives? Have
> you any spare SATA ports to upgrade to raid-6?
>
> I'd ddrescue the two drives with the highest count (is that sda and
> sdd?), then force assemble the copies. That stands a good chance of
> succeeding. If that works, you can add back the third drive to recover
> your raid-5 - keeping the original two as a temporary backup.
>
> If you can't get spare drives, overlay the two good drives then see if a
> force gets you a working array. If it does, then you can try it without
> the overlay, but not having a backup increases the risk ...
>
> Then add one of the original drives back to convert to raid-6.
>
> The event counts make me suspect the middle drive got booted long ago
> for some reason, then you've had a hiccup that booted a second drive.
> Quite likely if you didn't have ERC enabled. So it does look like an
> easy fix but because you've effectively got a broken raid-0 at present,
> the risk to your data from any further problem is HIGH. Read
>
> https://raid.wiki.kernel.org/index.php/Linux_Raid#When_Things_Go_Wrogn
>
> If you don't have any spare SATA ports, go and buy something like
>
> https://www.amazon.co.uk/dp/B00952N2DQ/ref=twister_B01DUJJZ8U?_encoding=UTF8&th=1
>
> You want a card with one SATA *and* one eSATA - beware - I think most of
> these have a jumper to switch between SATA *or* eSATA so you'll want a
> card that claims two of each - it will only actually drive two sata
> devices so configure one port for SATA for your raid-6, and one for
> eSATA so you can temporarily add external disks ...
>
> https://www.amazon.co.uk/iDsonix-SuperSpeed-Docking-Station-Free-Black/dp/B00L3W0F40/ref=sr_1_1?ie=UTF8&qid=1529780418&sr=8-1&keywords=eSATA%2Bdisk%2Bdocking%2Bstation&th=1
>
> Not sure whether you can connect this with an eSATA port-multiplier
> cable - do NOT run raid over the USB connection !!!
>
> Cheers,
> Wol