Re: Mdadm server eating drives

Hi Barrett,

Please interleave your replies, and trim unnecessary quotes.

On 06/13/2013 08:19 PM, Barrett Lewis wrote:
> Sorry for the delay, I wanted to let the memtest run for 48 hours.
> It's at 49 hours now with zero errors, so memory is pretty much ruled
> out.
> 
> As far as power, I would *think* I have enough power.  The power
> supply is a 500w Thermaltake TR2.  It's powering an Asrock z77 mobo
> with an i5-3570k, and the only card on it is a dinky little 2 port
> sata card my OS drive is on (the RAID components are plugged into the
> mobo).  Eight 7200 drives and an SSD.  Tell me if this sounds
> insufficient.
> 
> Phil, when you say "what you are experiencing", what do you mean
> specifically?  The dmesg errors and drives falling off?  Or did you
> mean the beeping noises (since thats the part you trimmed)?

Drives dropping out when they shouldn't, and smartctl says "PASSED".
This is *unavoidable* when you have mismatched device and driver timeouts.

> Here is the data you requested
> 
> 1) mdadm -E /dev/sd[a-f]       http://pastie.org/8040826

/dev/sdd and /dev/sde have old event counts ...

> 2) mdadm -D /dev/md0          http://pastie.org/8040828

... matching the array report ...

> 3)
> smartctl -x /dev/sda                   http://pastie.org/8040847

Ok, but no error recovery support (typical of green drives).

> smartctl -x /dev/sdb                   http://pastie.org/8040848

Ok, green again.  No ERC.

> smartctl -x /dev/sdc                   http://pastie.org/8040850

Ok, with ERC support, but disabled.  Not a green drive.

> smartctl -x /dev/sdd                   http://pastie.org/8040851

Not Ok.  A few relocations, a couple pending errors.  ERC support
present but disabled.

> smartctl -x /dev/sde                   http://pastie.org/8040852

Not Ok.  No relocations, but several pending errors.  No ERC.

> smartctl -x /dev/sdf                   http://pastie.org/8040853

Ok, but no ERC.

> 4) cat /proc/mdstat                   http://pastie.org/8040859
> 
> 5) for x in /sys/block/sd*/device/timeout ; do echo $x $(< $x) ; done
>                  http://pastie.org/8040870

All timeouts are still the default 30 seconds.  Without ERC enabled,
these values must be two to three minutes.  I recommend 180 seconds.
Your array *will not* complete a rebuild without dealing with this
problem.

> 6) dmesg | grep -e sd -e md                   http://pastie.org/8040871
> (note that I have rebooted since the last dmesg link I posted (where
> two drives failed) because I was running memtest, if I should do dmesg
> differently, let me know)
> 
> 7) cat /etc/mdadm.conf                   http://pastie.org/8040876

I generally simplify the ARRAY line to just the device and the UUID, but
it is ok as is.
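For reference, a trimmed-down ARRAY line would look like this (the UUID
below is just a placeholder; use the real one reported by
"mdadm -D /dev/md0"):

```
ARRAY /dev/md0 UUID=aaaaaaaa:bbbbbbbb:cccccccc:dddddddd
```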

> Adam, I wouldn't be opposed to spending the money on a good sata card,
> but I'd like to get opinions from a few people first.  Any suggestions
> on a good one for mdadm specifically?

No need.  Just fix your timeouts.  For the two devices that support ERC,
you need to turn it on:

> smartctl -l scterc,70,70 /dev/sdc
> smartctl -l scterc,70,70 /dev/sdd

For the others, you need long timeouts in the linux driver:

> for x in /sys/block/sd[abef]/device/timeout ; do echo 180 >$x ; done

This must be done now, and at every power cycle or reboot.  rc.local or
similar distro config is the appropriate place.  (Enterprise drives
power up with ERC enabled.  As do raid-rated consumer drives like WD Red.)
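As a sketch, the whole fix could go in /etc/rc.local (or your distro's
equivalent) like so -- device letters here are assumed from your
listing, so double-check them on your system:

```
#!/bin/sh
# Enable ERC (7.0 second read/write limits) on the drives that support it:
smartctl -l scterc,70,70 /dev/sdc
smartctl -l scterc,70,70 /dev/sdd
# Long driver timeouts for the drives without ERC support:
for x in /sys/block/sd[abef]/device/timeout ; do echo 180 > $x ; done
```

Beware that device letters can change across reboots; a udev rule keyed
on drive serial numbers is more robust if you want to get fancy.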

Then stop and re-assemble your array.  Use --force to reintegrate your
problem drives.  Fortunately, this is a raid6: with compatible timeouts,
your rebuild will succeed.  A URE on /dev/sdd would have to fall in the
same place as a URE on /dev/sde to kill it.
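The stop and forced re-assembly would look roughly like this (member
list taken from your reports -- verify it against /proc/mdstat before
running anything):

```
mdadm --stop /dev/md0
mdadm --assemble --force /dev/md0 /dev/sd[a-f]
```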

Upon completion, the UREs will either be fixed or relocated.  If any
drive's relocations reach double digits, I'd replace it.

Finally, after your array is recovered, set up a cron job that'll
trigger a "check" scrub of your array on a regular basis.  I use a
weekly scrub.  The scrub keeps UREs that develop on idle parts of your
array from accumulating.  Note, the scrub itself will crash your array
if your timeouts are mismatched and any UREs are lurking.
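A minimal weekly-scrub crontab entry might look like this (md0 and the
schedule are just examples, and many distros already ship an equivalent
job in /etc/cron.d/):

```
# Sunday 01:00: kick off a check scrub; progress shows in /proc/mdstat
0 1 * * 0  root  echo check > /sys/block/md0/md/sync_action
```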

I'll let you browse the archives for a more detailed explanation of
*why* this happens.

Phil
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html