On Oct 28, 2012, at 2:34 PM, Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
>
> That's not true. A drive that can sit there like a bump on a log for 2 minutes before it issues an actual read failure on sector(s) means mdadm is likewise waiting, and everything above it is waiting. That's a long hang. I'd not be surprised to see users go for a hard reset with such a system hang after 45 seconds, let alone 2 minutes.
>
> Ideally what you'd get is a quick first error recovery with a clean, normally operating array. As soon as the first drive fails, the system would set the remaining drives to a slightly longer error recovery time, so that they don't error out nearly as quickly: ask them to try a little harder before they give up. If you get another read error, there is no mirror or parity to rebuild from. Best to try a little longer in such a degraded state.

In fact the long error recovery of consumer drives *prevents* automatic correction by md, or at least significantly delays it. If the drive can recover a sector in 30 seconds (let alone 2 minutes) instead of failing the read, that sector will not be corrected: md won't get an alternate copy from mirror or from parity, and won't overwrite the obviously flaky sector. So instead you get a propensity for bad sector accumulation, even with regular check scrubs.

For any serious use I just wouldn't use the Greens without very non-consumer-like scrubs, extended SMART tests, and cycling out drives so they could be nuked with ATA Enhanced Secure Erase, say once a year or maybe more often. And a rigorous backup. With that kind of expertise and dedication should come a better budget for a better drive.

Chris Murphy
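
For illustration, the "short timeout while clean, longer timeout once degraded" idea from the quoted text could be sketched roughly as follows. This is only a minimal Python sketch under stated assumptions, not something md or mdadm does on its own: the 7 s and 15 s values, the function names, and the device names are made up for the example. It assumes the drives accept SCT ERC settings via `smartctl -l scterc` (units of 100 ms), which many consumer drives do not, and it only builds the command line rather than running it.

```python
from pathlib import Path
from typing import List

# Assumed values for illustration only (not from the original post).
# smartctl expects SCT ERC limits in units of 100 ms (deciseconds).
CLEAN_ERC_DS = 70      # 7 s: fail fast so md can rewrite the sector from redundancy
DEGRADED_ERC_DS = 150  # 15 s: try harder, but stay under the kernel's default 30 s command timeout


def erc_deciseconds(degraded_count: int) -> int:
    """Pick the ERC limit based on how many members the array is missing."""
    if degraded_count < 0:
        raise ValueError("degraded_count must be non-negative")
    return DEGRADED_ERC_DS if degraded_count > 0 else CLEAN_ERC_DS


def read_degraded(md_name: str, sysfs_root: str = "/sys") -> int:
    """Read the md 'degraded' counter, e.g. /sys/block/md0/md/degraded."""
    path = Path(sysfs_root) / "block" / md_name / "md" / "degraded"
    return int(path.read_text().strip())


def scterc_command(device: str, degraded_count: int) -> List[str]:
    """Build (but do not run) the smartctl call that sets read/write ERC."""
    limit = erc_deciseconds(degraded_count)
    return ["smartctl", "-l", f"scterc,{limit},{limit}", device]


if __name__ == "__main__":
    # Hypothetical member device; print the commands for both array states.
    for state in (0, 1):
        print(" ".join(scterc_command("/dev/sda", state)))
```

On a real system one would feed `read_degraded("md0")` into `scterc_command()` for each remaining member and run the result as root. SCT ERC settings are typically lost on a drive power cycle, so they would have to be reapplied at boot.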