On 10/15/19 5:04 PM, Wol's lists wrote:
> On 15/10/2019 23:44, Curtis Vaughan wrote:
>>
>>>>
>>>> Device info:
>>>> ST1000DM003-9YN162, S/N:Z1D17B24, WWN:5-000c50-050e6c90f, FW:CC4C,
>>>> 1.00 TB
>>> Urkk
>>>
>>> Seagate Barracudas are NOT recommended. Can you do a "smartctl -x" and
>>> see if SCT/ERC is supported? I haven't got a datasheet for the 1TB
>>> version, but I've got the 3TB version and it doesn't support it. That
>>> means you WILL suffer from the timeout problem ...
>>>
>>> (Not that that's your problem here, but there's no point tempting fate.
>>> I know Seagate say "suitable for desktop raid", but the experts on this
>>> list wouldn't agree ...)
>>
>> SCT is supported, but SCT/ERC is not. GREAT! Hmm, and the replacement
>> is also a Seagate.
>
> My new drives are Seagate Ironwolf, which are supposedly fine. I still
> haven't managed to boot the system - it's been sat for ages with an
> assembly problem I haven't solved - I hope it's something as simple as
> needing a BIOS update, but I can't do that ...
>
>> However, another of my servers also has Seagates like the one I'm
>> buying, and on it ERC is supported. So maybe I should buy one more
>> such drive and also replace sdb?
>
> Depends. If you run the script on the timeout problem page it "fixes"
> the problem. The only downside is that if you have a disk error,
> you've just set your timeout to three minutes, so the system could
> freeze for near enough that time. Not nice for the user, but at least
> the system will be okay. A proper ERC drive can be set to return with
> an error very quickly - the default is 7 secs.
>>
>> Here are the results of the command on the problem drive:
>>
>> smartctl -x /dev/sda | grep SCT
>> SCT capabilities:            (0x3085) SCT Status supported.
>> 0xe0       GPL,SL  R/W      1  SCT Command/Status
>> 0xe1       GPL,SL  R/W      1  SCT Data Transfer
>> SCT Status Version:                  3
>> SCT Version (vendor specific):       522 (0x020a)
>> SCT Support Level:                   1
>> SCT Data Table command not supported
>> SCT Error Recovery Control command not supported
>>
> Typical Barracuda :-(

I think I got it working; I just want to make sure I did this right.
Using fdisk I recreated the exact same partitions on sda as on sdb,
then ran "mdadm --re-add" to re-join each partition to its RAID volume.
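(For reference, a minimal sketch of the equivalent commands - assuming
/dev/sdb is the surviving member, /dev/sda is the blank replacement, and
the disks use MBR partition tables, which the version-0.90 superblocks
suggest:)

  # Clone the partition table from the healthy drive rather than
  # recreating it by hand in fdisk (MBR only; GPT needs sgdisk).
  sfdisk -d /dev/sdb | sfdisk /dev/sda

  # Re-join each partition to its mirror. If --re-add is refused
  # (e.g. no usable superblock left on the partition), a plain --add
  # forces a full rebuild instead.
  mdadm /dev/md0 --re-add /dev/sda1
  mdadm /dev/md1 --re-add /dev/sda2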
So now here is the output of various commands. Does everything look
right?

cat /proc/mdstat

Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sda1[2] sdb1[1]
      7811008 blocks [2/1] [_U]
        resync=DELAYED

md1 : active raid1 sda2[2] sdb2[1]
      968949696 blocks [2/1] [_U]
      [>....................]  recovery =  0.4% (4015552/968949696) finish=184.6min speed=87083K/sec

unused devices: <none>

mdadm --detail /dev/md0

/dev/md0:
        Version : 0.90
  Creation Time : Wed Jul 18 15:00:44 2012
     Raid Level : raid1
     Array Size : 7811008 (7.45 GiB 8.00 GB)
  Used Dev Size : 7811008 (7.45 GiB 8.00 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed Oct 16 14:10:46 2019
          State : clean, degraded, resyncing (DELAYED)
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

Consistency Policy : resync

           UUID : 7414ac79:580af0ce:e6bbe02b:915fa44a
         Events : 0.1081

    Number   Major   Minor   RaidDevice State
       2       8        1        0      spare rebuilding   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1

mdadm --detail /dev/md1

/dev/md1:
        Version : 0.90
  Creation Time : Wed Jul 18 15:00:53 2012
     Raid Level : raid1
     Array Size : 968949696 (924.06 GiB 992.20 GB)
  Used Dev Size : 968949696 (924.06 GiB 992.20 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Wed Oct 16 14:12:20 2019
          State : clean, degraded, recovering
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

Consistency Policy : resync

 Rebuild Status : 1% complete

           UUID : ac37ca92:939d7053:3b802bf3:08298597
         Events : 0.131712

    Number   Major   Minor   RaidDevice State
       2       8        2        0      spare rebuilding   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2
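(Following up on the SCT/ERC discussion above: the "timeout problem
page" script amounts to settings along these lines. A sketch only, with
/dev/sdX standing in for whichever member drive is being configured;
neither setting survives a reboot, so they are normally reapplied from a
boot script:)

  # Drive supports SCT ERC: make it give up and report an error after
  # 7 seconds (the value is in tenths of a second), safely inside the
  # kernel's default 30-second command timeout.
  smartctl -l scterc,70,70 /dev/sdX

  # Drive lacks SCT ERC (like this Barracuda): raise the kernel timeout
  # to 180 seconds instead, so the kernel outlasts the drive's internal
  # retries rather than kicking a merely slow disk out of the array.
  echo 180 > /sys/block/sdX/device/timeout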