On Fri Jan 11, 2013, Chris Murphy wrote:
> On Jan 11, 2013, at 10:39 AM, Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
> > They probably have a high ERC time out as all consumer disks do so you
> > should also check /sys/block/sdX/device/timeout and make sure it's not
> > significantly less than the drive. It may be possible for smartctl or
> > hdparm to figure out what the drive ERC timeout is.
> >
> > http://cgi.csc.liv.ac.uk/~greg/projects/erc/
>
> Actually what I wrote is misleading to the point it's wrong. You want the
> linux device time out to be greater than the device timeout. The device
> needs to be allowed to give up, and report back a read error to linux/md,
> so that md knows it should reconstruct the missing data from parity, and
> overwrite the (obviously) bad blocks causing the read error.
>
> If the linux device time out is even a little bit less than the drive's
> timeout, md never gets the sector read error, doesn't repair it, since
> linux boots the whole drive. Now instead of repairing a few sectors, you
> have a degraded array on your hands. Usual consumer drive time outs are
> quite high, they can be up to a couple minutes long. Linux device time out
> is 30 seconds.
>

Hm, ok. I'll look into that, and set those up properly. Thanks.

> Chris Murphy
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Thomas Fjellstrom
thomas@xxxxxxxxxxxxx
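
For anyone wanting to script the check described above (the Linux device timeout should be greater than the drive's own error recovery timeout), here is a rough Python sketch. It is not from the thread. It assumes the drive's ERC value is obtained by running `smartctl -l scterc /dev/sdX` and that the output contains a line like `Read:     70 (7.0 seconds)`, with the number in tenths of a second; that format may differ between smartmontools versions. The helper names and the 5 second safety margin are made up for illustration.

```python
import re
from pathlib import Path
from typing import Optional


def read_kernel_timeout(dev: str, sysfs_root: str = "/sys/block") -> int:
    """Return the Linux device timeout in seconds for a disk such as 'sda'."""
    return int(Path(sysfs_root, dev, "device", "timeout").read_text().strip())


def parse_scterc_read(output: str) -> Optional[float]:
    """Extract the read ERC timeout in seconds from `smartctl -l scterc` text.

    Assumes a line like 'Read:     70 (7.0 seconds)', value in deciseconds.
    Returns None when no numeric value is found (e.g. ERC disabled or
    unsupported), in which case the drive may retry for a long time.
    """
    match = re.search(r"Read:\s+(\d+)\s+\(", output)
    return int(match.group(1)) / 10.0 if match else None


def kernel_timeout_ok(kernel_s: float, drive_s: Optional[float],
                      margin_s: float = 5.0) -> bool:
    """True only if the kernel waits longer than the drive's ERC timeout.

    margin_s is an arbitrary safety margin, not a value from the thread.
    An unknown drive timeout is treated as not OK, since consumer drives
    can take far longer than the 30 second Linux default.
    """
    if drive_s is None:
        return False
    return kernel_s >= drive_s + margin_s
```

Usage would be along the lines of `kernel_timeout_ok(read_kernel_timeout("sda"), parse_scterc_read(smartctl_text))`, where `smartctl_text` is the captured output of the smartctl command. If it returns False, either the drive's ERC timeout has to come down or the value in /sys/block/sdX/device/timeout has to go up.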