Hi Matthias,

On 12/03/2017 01:14 PM, Matthias Walther wrote:
> Hello,
>
> On 03.12.2017 at 18:20, Phil Turmel wrote:
>> Very good. At some point you need to replace the desktop drive -- it's
>> unsafe to use in a raid array -- but it doesn't look like it's blowing
>> up at the moment. Use the following workaround on every boot until you
>> replace it:
>>
>> echo 180 > /sys/block/sde/device/timeout
>>
>> Search the archives for "timeout mismatch" to see many discussions on
>> why that drive is a time bomb.
>
> this is an interesting point. As far as I understand it, there's no
> difference between a) the device tells the kernel, that an error
> occurred (ERC) or b) the kernel just waits three minutes.

From MD raid's perspective, as long as the link doesn't time out, no.
Many services that one might want to use with such a server will have
problems with a 3-minute filesystem freeze, which is why I highly
recommend replacing the drives with something that'll respond quicker.

> From my point of understanding, I see no reason to avoid those disks.
> Just raise this timeout to 180 on all disks. Even those with ERC can be
> set to 180 seconds, because on some mainboards the order of sdX changes
> every boot. On your home nas it doesn't really matter if there's an
> access delay. This is of course not acceptable on enterprise systems.

No, lots of protocols can't wait that long. Lots of humans can't wait
that long either, and will start physical interventions.

> By the way, the kernel doesn't just easily throw the device out. From my
> experiences it hard resets the link and completely reinitializes the
> device. Only if that fails, the raid will be degraded and if this fails,
> the device probably has a problem and should be replaced.

MD raid tries to fix read errors. When a read returns an error, MD
retrieves the data from a mirror (raid1, raid10) or reconstructs it from
parity and/or syndrome (raid4,5,6) and then writes it back to the
problem sector.
This is entirely appropriate as large modern hard drives do occasionally
experience transient read errors. Transient read errors are fixable by
writing new content to that sector location. Even if the error is not
transient, modern drives use the write operation to verify that problem
and then relocate the sector.

If the link resets because the driver timed out before the device
responded, then MD gets another error message *while* the link is
resetting. The follow-up write to correct the sector fails immediately
because the link is down. The *write error* kicks the drive out.

A quick burst of read errors will kick out a drive (20 in one hour), or
a steady stream of read errors (10 per hour sustained), or *any* write
error.

> I run a raid-6 on six really cheap old second hand 4 TB drives and never
> had an issue with that in the past two years. I had no real failures and
> no accidentally or prematurely dropped devices. Mdadm just runs. And
> this raid writes about 50 GB each and every single day and never goes to
> sleep. This is what differs mdadm from hardware raid controllers, which
> really shouldn't used with non ERC drives due to exactly that timing
> problem.

If you are using the driver timeout workaround, of course you wouldn't
see your array collapse. And for household use, you probably don't care
if your movie playback freezes for the occasional minute or two.

> Though I run a check every month, where all data is read, just to make
> sure it doesn't rot on the discs.

During scrubs, the long timeout on a URE won't impact the filesystem, so
your users are even less likely to notice. This is very good practice.

> In my opinion a (monitored) raid-6 on
> old, cheap non ERC drives is safer, than a raid-5 on „premium
> overpriced“ drives.

No question about it. Raid6 is *always* safer than raid5. That doesn't
mean non-ERC drives are a good idea.

> Never forget, it's call raid - random array of
> inexpensive disks.

The original name is "Redundant Array of Inexpensive Disks". The current
standard uses "Independent" instead of "Inexpensive" because the
standards body is made up of manufacturers. /-:

> In cynical words, I see it this way: The hdd and nas manufactures came
> together and found a way to push the prices up.

Oh, I'm pretty cynical. You should read my posts in 2011 when I worked
all this out -- after Seagate screwed me by taking scterc out of their
desktop drives.

But timeout mismatch is a real problem. The NAS drives didn't exist as
an option back then, and I'm sure it was complaints like ours that
caused that niche to come into existence. At a 10% or so price premium.
(Vs. 2x pricing for enterprise drives.)

> Regards,
> Matthias

Phil
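
Addendum, for anyone applying the every-boot workaround discussed above: because sdX names can change between boots, the timeout is often set on every disk rather than on one named device. The following is only a minimal sketch of that idea, not a tested or official script. The function name set_disk_timeouts and its optional second argument (an alternate sysfs directory, handy for dry runs) are made up for illustration; the only thing taken from the thread is the write of 180 to /sys/block/sdX/device/timeout.

```shell
#!/bin/sh
# Sketch: raise the driver command timeout for every sd* disk.
# set_disk_timeouts is a hypothetical helper, not part of mdadm or the kernel.
#   $1 = timeout in seconds (default 180, the value used in this thread)
#   $2 = sysfs block directory (default /sys/block; override for a dry run)
set_disk_timeouts() {
    secs="${1:-180}"
    root="${2:-/sys/block}"
    for t in "$root"/sd*/device/timeout; do
        # If no sd* disk exists, the glob stays unexpanded; skip it.
        [ -e "$t" ] || continue
        echo "$secs" > "$t"
    done
}
```

Called as root from a boot script, for example `set_disk_timeouts 180`, this writes the same value as the single-device command quoted above to each sd* disk. The setting does not survive a reboot, which is why it has to be reapplied on every boot.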