Re: Mdadm server eating drives

On Thu, Jun 13, 2013 at 9:08 PM, Phil Turmel <philip@xxxxxxxxxx> wrote:
> Please interleave your replies, and trim unnecessary quotes.

No problem.

>> smartctl -l scterc,70,70 /dev/sdc
>> smartctl -l scterc,70,70 /dev/sdd
>> for x in /sys/block/sd[abef]/device/timeout ; do echo 180 >$x ; done
>
> This must be done now, and at every power cycle or reboot.  rc.local or
> similar distro config is the appropriate place.  (Enterprise drives
> power up with ERC enabled.  As do raid-rated consumer drives like WD Red.)

Seems that the drives themselves retained the ERC settings after a
reboot.  But I went ahead and put scterc and the timeouts in rc.local.
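
For reference, the rc.local block ended up looking roughly like this (a
sketch, assuming the device letters stay where they are on this box --
they can shift if the controller enumerates differently):

# re-apply ERC and driver timeouts at every boot (in /etc/rc.local);
# sdc/sdd are the two drives that accept scterc, sd[abef] only get the
# longer driver timeout
smartctl -l scterc,70,70 /dev/sdc
smartctl -l scterc,70,70 /dev/sdd
for x in /sys/block/sd[abef]/device/timeout ; do echo 180 > $x ; done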

>
> Then stop and re-assemble your array.  Use --force to reintegrate your
> problem drives.  Fortunately, this is a raid6--with compatible timeouts,
> your rebuild will succeed.  A URE on /dev/sdd would have to fall in the
> same place as a URE on /dev/sde to kill it.

It worked.  Yer a wizard!  Thank you!
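
For anyone hitting this thread later, the stop/re-assemble step was
along these lines (the md name and member list here are assumptions
from my setup, not gospel):

# stop the degraded array, then force the kicked drives back in
# (/dev/md0 and the member list are placeholders for this sketch)
mdadm --stop /dev/md0
mdadm --assemble --force /dev/md0 /dev/sd[a-f]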

> Finally, after your array is recovered, set up a cron job that'll
> trigger a "check" scrub of your array on a regular basis.  I use a
> weekly scrub.  The scrub keeps UREs that develop on idle parts of your
> array from accumulating.  Note, the scrub itself will crash your array
> if your timeouts are mismatched and any UREs are lurking.

I'll definitely do this.  When you talk about mismatched timeouts, do
you mean matched across the component drives (as in
/sys/block/sdX/device/timeout), or between that driver timeout and some
per-drive device timeout?  If you mean across components, are my
timeouts matched now, even though I did not raise the 30 seconds on
the two drives with ERC?
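
In the meantime I've sketched the weekly scrub as a cron entry roughly
like this (the md name is an assumption until I double-check it on this
box):

# /etc/cron.d/mdadm-check -- weekly "check" scrub, Sunday 01:00
# (/dev/md0 assumed; substitute the real array name)
0 1 * * 0  root  echo check > /sys/block/md0/md/sync_action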
