Re: Raid5 hang in 3.14.19

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[snip]
On 10/13/2014 08:42 PM, NeilBrown wrote:
Write errors start happening.

You should only get a write error if no writes successfully completed to
in_sync, non-faulty devices.
It is possible that the write to sdg3 completed before it was marked in-sync,
and the write to sdh3 completed after it was marked as faulty.
How long after recovery completes do you fail the next device?
The logs suggest it is the next second, which could be anywhere from 1msec
to 1998 msecs.


NeilBrown


FYI Neil: Running through my logs, I noticed 6 of these failures in my testing over the past few days. All "recovery completed, wait a second, fail the other drive" cases. Same signature in the logs. Apparently things kept working until
the filesystem tripped on its journal and fell over.

So the upshot is this seems reasonably reproducible.

-Bill
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux