On 06.10.2013 23:44, Phil Turmel wrote:
The answer is *NO*. That is not expected. But it does happen with timeout mismatches, and the double failure you experienced is a common result of error correction timeout mismatch. Timeout mismatch is where your drives are internally trying to retry reading a bad sector long after the OS has given up. It is always associated with consumer-grade hard drives in raid arrays.
Right, I knew that consumer HDDs did that, but I didn't expect it to cause such mayhem. So the takeaway for me is: as soon as you see bad blocks on a drive, fail it, otherwise the whole array will probably get kicked out sooner or later. Or try to manually force the drive to reallocate, and then do a scrub.
You might want to search the list archives for various combinations of "error recovery", "scterc", "URE" and "timeout mismatch" for a full description of the problem and the recommended ways to avoid it.
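For anyone reading this in the archives, here is a rough Python sketch of the check behind "timeout mismatch". It compares the kernel's command timeout, which Linux exposes in /sys/block/<dev>/device/timeout, with the drive's SCT ERC limit as reported by `smartctl -l scterc /dev/sdX` (in units of 100 ms). The function names and the 180-second threshold for drives without working ERC are my own assumptions for illustration, not something stated in this thread.

```python
from pathlib import Path
from typing import Optional

# Assumed rule of thumb, not taken from this thread: a drive without working
# ERC may keep retrying for minutes, so the kernel should wait about this long.
NO_ERC_SAFE_TIMEOUT_S = 180


def kernel_timeout_seconds(device: str, sysfs_root: str = "/sys/block") -> int:
    """Read the kernel's command timeout in seconds for a device such as 'sda'."""
    path = Path(sysfs_root) / device / "device" / "timeout"
    return int(path.read_text().strip())


def has_timeout_mismatch(kernel_timeout_s: int,
                         erc_deciseconds: Optional[int]) -> bool:
    """Return True if the drive may still be retrying when the kernel gives up.

    erc_deciseconds is the drive's SCT ERC limit in units of 100 ms
    (70 means 7.0 s), or None/0 if ERC is disabled or unsupported.
    """
    if not erc_deciseconds:
        # No bounded recovery time: only a long kernel timeout is safe.
        return kernel_timeout_s < NO_ERC_SAFE_TIMEOUT_S
    # Bounded recovery: the drive must give up before the kernel does.
    return erc_deciseconds / 10 >= kernel_timeout_s
```

For example, `has_timeout_mismatch(kernel_timeout_seconds("sda"), 70)` would check a drive whose ERC is set to 7 seconds, while passing `None` models a consumer drive that does not support or has not enabled ERC.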
Thanks, will do.
--
Michał (Saviq) Sawicz <michal@xxxxxxxxxx>