On 10/30/2016 12:19 PM, Andreas Klauer wrote:
> On Sun, Oct 30, 2016 at 08:38:57AM -0700, Marc MERLIN wrote:
>> (mmmh, but even so, rebuilding the spare should have cleared the bad
>> blocks on at least one drive, no?)
>
> If n+1 disks have bad blocks there's no data to sync over, so they just
> propagate and stay bad forever. Or at least that's how it seemed to work
> last time I tried it.
>
> I'm not familiar with bad blocks. I just turn it off.

I, too, turn it off. (I never let it turn on, actually.) I'm a little
disturbed that this feature has become the default on new arrays.

This feature was introduced specifically to support underlying storage
technologies that cannot perform their own bad block management. It
doesn't implement any relocation algorithm for blocks marked bad, so it
simply gives up redundancy for the affected sectors, and when there's no
remaining redundancy, it passes the error up the stack.

In this case, your errors were created by known communications weaknesses
that should always be recoverable with --assemble --force.

As far as I'm concerned, the bad block system is an incomplete feature
that should never be used in production, and certainly not on top of any
storage technology that already implements its own error detection,
correction, and relocation. Which is to say, every modern SATA and SAS
drive.

Phil
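
P.S. For anyone who hits this thread in the archives, here's roughly what
I mean by --assemble --force, plus one way to get rid of the bad block
list afterwards. Treat it as a sketch, not a recipe: the array and device
names are placeholders for your setup, and --update=force-no-bbl needs a
reasonably recent mdadm, so check your man page before running any of it.

  # Force-assemble from the surviving members after link errors kicked
  # drives out (placeholder names, adjust to your array).
  mdadm --stop /dev/md0
  mdadm --assemble --force /dev/md0 /dev/sd[bcde]1

  # See what the bad block log claims is unreadable on each member.
  mdadm --examine-badblocks /dev/sdb1

  # Drop the bad block list on re-assembly.  no-bbl only works if the
  # list is empty; force-no-bbl clears it even if it has entries.
  mdadm --stop /dev/md0
  mdadm --assemble --update=force-no-bbl /dev/md0 /dev/sd[bcde]1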