[snip] On 10/13/2014 08:42 PM, NeilBrown wrote:
Write errors start happening. You should only get a write error if no writes successfully completed to in_sync, non-faulty devices. It is possible that the write to sdg3 completed before it was marked in-sync, and the write to sdh3 completed after it was marked as faulty. How long after recovery completes do you fail the next device? The logs suggest it is the next second, which could be anywhere from 1 msec to 1998 msecs.

NeilBrown
FYI Neil: Running through my logs, I noticed 6 of these failures in my testing over the past few days. All were "recovery completed, wait a second, fail the other drive" cases, with the same signature in the logs. Apparently things kept working until
the filesystem tripped on its journal and fell over. So the upshot is this seems reasonably reproducible.

-Bill