Re: Suboptimal raid6 linear read speed

[ ... ]

>> [ ... ] there are a few narrow cases where the rather skewed
>> performance envelopes of RAID5 and even of RAID6 match
>> workload and budget requirements. But it takes apparently
>> unusual insight to recognize these cases, so just use RAID10
>> even if you suspect it is one of those narrow cases.

> In your whole post you never touched on URE rates (well you
> did, but you didn't seem to think this was a problem).

They are a big problem, especially because in a typical RAID set
they are not uncorrelated, either with each other or with the
environment, and I wrote a lot about that.

The *absolute* level of URE rates matters less, but as some
people on this thread have noticed, even the perhaps somewhat
optimistic rates quoted by manufacturers are not that low when
compared with the amount of data involved in reading a whole
disk.

> I'm using RAID6 because I don't really care about performance,

That's a pretty unusual case, but perhaps that falls under the
"know better" qualification above, except that:

> but I do want to be able to fail one drive and have scattered
> URE handled while rebuilding.

RAID6 is not appropriate for this either.

Perhaps I have not been clear in my earlier comment, but the URE
rate is not a constant you can just (euphemism) uncritically
read from a spec sheet.

Let's try shouting:

  * THE URE RATE DEPENDS ON ENVIRONMENTAL FACTORS AND COMMON
    MODES OF FAILURE (INCLUDING THE AGE OF THE DRIVE).

  * In a typical incomplete or rebuilding RAID6 the aggregate
    URE rate is much higher than the single-drive URE rate or
    even the RAID10 URE rate (rough numbers sketched below).
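
As a purely illustrative back-of-envelope sketch (the 1e-14
per-bit URE rate is the usual optimistic spec-sheet figure, the
4TB drive size and the 10-drive set are assumptions of mine, and
UREs are treated as independent even though the whole point
above is that they are not, which only makes the RAID6 case
worse):

    import math

    # Assumed numbers, not measured ones.
    URE_PER_BIT    = 1e-14        # optimistic spec-sheet figure
    BITS_PER_DRIVE = 4e12 * 8     # hypothetical 4TB drive

    def p_at_least_one_ure(drives_read):
        # Poisson approximation: P(no URE) = exp(-rate * bits read)
        return 1 - math.exp(-URE_PER_BIT * BITS_PER_DRIVE * drives_read)

    # Reading one drive end to end, e.g. a RAID10 mirror rebuild:
    print("one drive read: %.0f%%" % (100 * p_at_least_one_ure(1)))   # ~27%
    # Degraded 10-drive RAID6 rebuild: all 9 survivors read end to end:
    print("RAID6 rebuild : %.0f%%" % (100 * p_at_least_one_ure(9)))   # ~94%

Under those (generous) assumptions a RAID10 rebuild has roughly
a 1-in-4 chance of hitting a URE somewhere, while a degraded
RAID6 rebuild over the same drives is close to certain to hit at
least one, before taking any correlation into account.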

I also sometimes suspect that manufacturers quote ideal numbers;
for example I just had a look at the user manuals for a few
"enterprise" and "desktop" Seagate models (they do very detailed
manuals): their "annualized return rate" is usually around
0.4-0.7%, while many large sites report annual failure rates of
around 2-4%.

> I have had scattered URE hit me numerous times over the past
> 10 years.

That is indeed a big problem, and it is good, and rare, that you
don't underestimate it.

> With RAID6 they are handled nicely even with a failed drive.

Unfortunately RAID6 correlates failure modes across drives
because of the !"£$%^ parity, and it raises failure rates by
stressing all of the drives hard while the RAID6 set is
incomplete or syncing.

Incomplete or syncing RAID6 not only has pretty bad speed, which
you don't care about, but drives up error rates, and you should
care about that.

Sure, most of the time one can replace the failed drive and
resync before something bad happens, but at some point in the
life of the RAID set an incomplete or syncing RAID6 has a much
higher chance of suffering 2 or more failures...
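
Again only as a rough sketch with assumed numbers (a 3% per-drive
annual failure rate, a 10-drive set, 5 years of service, and
failures treated as independent, which understates the
correlated reality argued above):

    import math

    # Assumed numbers, not measured ones.
    AFR, DRIVES, YEARS = 0.03, 10, 5

    # Expected whole-drive failures, i.e. expected degraded/rebuild
    # windows, over the life of the set:
    expected_rebuilds = AFR * DRIVES * YEARS                      # 1.5

    # Chance the set goes through at least one such window:
    p_at_least_one = 1 - math.exp(-expected_rebuilds)             # ~78%

    print("expected rebuilds      : %.1f" % expected_rebuilds)
    print("P(at least one rebuild): %.0f%%" % (100 * p_at_least_one))

So even under mild assumptions the set is very likely to spend
at least one window of its life incomplete and resyncing,
reading every surviving drive end to end (with the URE odds
sketched earlier) precisely while its redundancy is reduced.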

BTW, an important qualification to all this: when people mention
BERs they really imply that this discussion is about *sector*
UREs, and *single*-sector ones in particular, and that matters a
fair bit.

It matters because a typical RAID6 under the stress of being
incomplete or syncing can suffer something far worse than
single-sector UREs: it can trigger much more brutal mechanical
or electronic failures.

Single-sector UREs, by contrast, are in most cases not such a
big deal, as losing a single sector's content usually allows
nearly complete recovery; IIRC some/most drives can return the
reconstructed sector content, which is often wrong in only a few
bits.

> If I cared about performance, I would either do what was
> discussed earlier in the thread (use smaller enterprise drives
> with better BER) in RAID10,

Whether "enterprise" drives effectively have a better URE is an
experimental question that is difficult to settle.

> or I would use threeway mirror RAID1 and use lvm to vg several
> RAID1:s together.

That case does not care much about performance either, because a
linear (concat) layout is not as quick for most workloads as a
RAID0.

