On Thu, Jun 13, 2013 at 9:08 PM, Phil Turmel <philip@xxxxxxxxxx> wrote:
> Please interleave your replies, and trim unnecessary quotes.

No problem.

>> smartctl -l scterc,70,70 /dev/sdc
>> smartctl -l scterc,70,70 /dev/sdd
>> for x in /sys/block/sd[abef]/device/timeout ; do echo 180 >$x ; done
>
> This must be done now, and at every power cycle or reboot. rc.local or
> similar distro config is the appropriate place. (Enterprise drives
> power up with ERC enabled. As do raid-rated consumer drives like WD Red.)

It seems the drives themselves retained the ERC settings after a
reboot, but I went ahead and put scterc and the timeouts in rc.local
anyway.

>
> Then stop and re-assemble your array. Use --force to reintegrate your
> problem drives. Fortunately, this is a raid6--with compatible timeouts,
> your rebuild will succeed. A URE on /dev/sdd would have to fall in the
> same place as a URE on /dev/sde to kill it.

It worked. Yer a wizard! Thank you!

> Finally, after your array is recovered, set up a cron job that'll
> trigger a "check" scrub of your array on a regular basis. I use a
> weekly scrub. The scrub keeps UREs that develop on idle parts of your
> array from accumulating. Note, the scrub itself will crash your array
> if your timeouts are mismatched and any UREs are lurking.

I'll definitely do this. When you talk about mismatched timeouts, do
you mean matched between the components (as in
/sys/block/sdX/device/timeout), or matched between that driver timeout
and some device timeout within each component? If you mean between
components, are my timeouts matched now, even though I did not raise
the 30-second driver timeout on the two drives with ERC?
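
For reference, here is a rough sketch of how the boot-time settings and the
weekly scrub discussed above might be wrapped up as shell functions. It is
not a tested recipe. The device names follow the thread; the array name md0
and the SYSFS_ROOT override (which defaults to /sys and exists only so the
functions can be tried against a scratch directory) are assumptions made for
illustration. The scterc values are in units of 100 ms, so 70 means 7.0
seconds.

```shell
#!/bin/sh
# Sketch only. sdc and sdd are the drives with ERC in this thread; sda, sdb,
# sde and sdf are the ones without it. SYSFS_ROOT and md0 are assumptions.

# Enable 7.0 second read/write error recovery on drives that support SCT ERC.
set_erc() {
    for disk in "$@"; do
        smartctl -l scterc,70,70 "/dev/$disk"
    done
}

# Raise the kernel driver timeout (in seconds) for the listed drives.
set_driver_timeouts() {
    secs=$1
    shift
    for disk in "$@"; do
        f="${SYSFS_ROOT:-/sys}/block/$disk/device/timeout"
        if [ -w "$f" ]; then
            echo "$secs" > "$f"
        else
            echo "skipping $disk: cannot write $f" >&2
        fi
    done
}

# Ask md to start a "check" scrub of one array.
start_check() {
    f="${SYSFS_ROOT:-/sys}/block/$1/md/sync_action"
    if [ -w "$f" ]; then
        echo check > "$f"
    else
        echo "cannot write $f" >&2
        return 1
    fi
}

# rc.local could then call, for example:
#   set_erc sdc sdd
#   set_driver_timeouts 180 sda sdb sde sdf
# and a weekly root cron job could call:
#   start_check md0
```

Keeping the sysfs writes behind a writability test means a renamed or missing
drive produces a warning instead of aborting the rest of rc.local.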