Hey Phil,

thanks for your swift reply.

> Disks don't need replacing on occasional read errors, because they are
> normal. Typical consumer-grade hard drives quote an unrecoverable read
> error rate of under 1x10^-14. That works out to, on average, one URE
> every 12.5 TB read. On large drives and large arrays of drives, that's
> just a few reads from end to end.

This makes sense. But does it apply here, given the flood of read errors
in my dmesg during just a single scrub? The probability of that many
errors in a single pass over 3 GB seems very low.

I also read with interest your mention of the timeout problem, as well as:

  https://raid.wiki.kernel.org/index.php/Timeout_Mismatch
  http://strugglers.net/~andy/blog/2015/11/09/linux-software-raid-and-drive-timeouts/

Could the timeout problem cause the flood of read errors? I am not sure
how to tell from the dmesg output.

On the timeout topic: the disks in question are WD Red 3TB, and I get:

  $ smartctl -l scterc /dev/sdb
  SCT Error Recovery Control:
             Read:     70 (7.0 seconds)
            Write:     70 (7.0 seconds)

Another possibly relevant data point: even after I wait many minutes
longer than the problematic 2-minute timeout threshold mentioned, a short
self-test with `smartctl -t short` immediately turns up read errors on
both disks:

  Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error

  Disk 1:
  # 1  Short offline     Completed: read failure        40%            16398             7501728

  Disk 2:
  # 1  Short offline     Completed: read failure        50%            16398             1758544

I interpret this as the disks having real problems, as opposed to UREs
at the specified error rate. What do you think?

Thanks!
Niklas
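
For reference, the quoted error rate can be turned into a rough expectation for a single scrub pass. The sketch below is only a back-of-the-envelope estimate under stated assumptions: it treats 1x10^-14 as a per-bit rate, assumes errors are independent (a Poisson model, which real drives need not follow), and assumes one full pass over a 3 TB disk. The function names are made up for illustration.

```python
import math

URE_RATE_PER_BIT = 1e-14  # quoted spec, assumed to mean "per bit read"
TB = 10**12               # decimal terabyte, as used by drive vendors


def expected_ures(bytes_read, rate_per_bit=URE_RATE_PER_BIT):
    """Mean number of UREs expected when reading bytes_read bytes."""
    return bytes_read * 8 * rate_per_bit


def prob_at_least(k, mean):
    """P(X >= k) for a Poisson variable with the given mean.

    Assumes independent errors; this is a modelling assumption,
    not something the drive specification promises.
    """
    below = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - below


if __name__ == "__main__":
    mean = expected_ures(3 * TB)  # assumed: one full pass over a 3 TB disk
    print(f"expected UREs per pass: {mean:.2f}")
    print(f"P(at least 1 URE):   {prob_at_least(1, mean):.3f}")
    print(f"P(at least 10 UREs): {prob_at_least(10, mean):.2e}")
```

Under these assumptions a single URE in one pass is plausible, but a flood of them is far outside what the specified rate alone would produce.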