Re: On URE and RAID rebuild - again!

On 2014-08-05 21:01, Piergiorgio Sartor wrote:

This means they, who wrote the article, did not
really *test* what they wrote.
Which already tells us a lot about the quality
of the article itself.

True. The problem is that the web is full of similar articles, and what they said sounded way too "suspicious" to me.

What's the difference between "probability" and
"statistical record"?
Is not one calculated with the other?

Premise: I am not a statistics expert, so I may have used the wrong terms, or my entire reasoning may be flawed.

I am trying to imagine _how_ the various vendors arrive at the claimed number and _how much_ confidence we can have in the URE rate. _If_ for some reason (e.g. magnetic interference during write and/or at rest) a fixed "wrong read" probability exists, _and_ _if_ it is correct to treat each sector read as a totally independent event, HDD manufacturers may have a quite precise formula from which the URE rate is obtained.

If, on the other hand, they "simply" observe how a big drive population behaves over time, maybe we can expect bigger variations between drives.

I'm just speculating here; what really worried me was the "you can't read your 2 TB drive 6 times" argument :)

I'm too lazy to try to understand what 3*10^14 is.
What is it?

I have read about 40 TB of data, or 320 Tb. 10^14 bits is 12.5 TB, or 100 Tb if you prefer. So 3*10^14 is simply the number of bits that I read (URE is expressed as 1 event per 10^14 bits, so I thought it made sense to use the same scale here).
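
As a quick check on the arithmetic above, here is a minimal Python sketch. It assumes decimal units (1 TB = 10^12 bytes, 8 bits per byte); the function names are mine and purely illustrative.

```python
# Assumption: decimal units, i.e. 1 TB = 10**12 bytes and 8 bits per byte.
BITS_PER_TB = 8 * 10**12


def tb_to_bits(terabytes):
    """Convert decimal terabytes to bits."""
    return terabytes * BITS_PER_TB


def bits_to_tb(bits):
    """Convert bits to decimal terabytes."""
    return bits / BITS_PER_TB


if __name__ == "__main__":
    print(bits_to_tb(10**14))  # size, in TB, of 10^14 bits
    print(tb_to_bits(40))      # number of bits in 40 TB
```

With these units, 10^14 bits comes out as 12.5 TB and 40 TB as 3.2*10^14 bits, which is the "3*10^14" figure rounded down.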

I'm under the impression you did not grasp the
concept of probability in such a context.
Given that it is not clear how the manufacturers
compute their numbers, both cases you describe
are the same.
All the possible conditions are included in the
probability computation.

I can see your point...

You can state: under worst case scenario, *each*
bit has a probability of 10^-14 of being wrong.
What does this mean?

... and _this_ is what really interests me. Manufacturers publish URE rates as "max" values, so it should be reasonable to assume that they describe a worst-case scenario. If this is the case, we can be quite sure that our URE rate will be lower than the published specs (assuming that drives are deployed with care).

On the other hand, in some articles and even on this mailing list I have read that published URE rates are really a "max of various means" and do not represent a true worst-case scenario.

As already written by others, it is not clear what
that number (10^-14) means.
A common understanding could be, as stated above,
each bit has a *probability* of 10^-14 of being wrong.

Practically, it does *not* mean that reading 10^14 bits
will deliver one wrong bit systematically.

But if the spec is representative of a normal usage scenario, reading 40 TB of data with a URE rate of 10^-14 has a very high probability (>95%) of returning a bad read ...
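
For reference, the >95% figure can be reproduced with a small Python sketch. It rests entirely on the assumptions questioned earlier in this thread: every bit read is an independent event with the same fixed error probability p = 10^-14, so that P(at least one URE in n bits) = 1 - (1 - p)^n. The function name and the decimal-terabyte constant are my own choices, not anything taken from a vendor spec.

```python
import math

# Assumption: decimal terabytes (1 TB = 10**12 bytes, 8 bits per byte).
BITS_PER_TB = 8 * 10**12


def p_at_least_one_ure(bits_read, ure_per_bit=1e-14):
    """Return 1 - (1 - p)**n, assuming independent, identically likely bit errors."""
    # log1p/expm1 keep precision when p is tiny and n is huge.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


if __name__ == "__main__":
    # 12 TB = a 2 TB drive read six times; 12.5 TB = 10^14 bits; 40 TB = my test.
    for tb in (12, 12.5, 40):
        p = p_at_least_one_ure(tb * BITS_PER_TB)
        print(f"{tb} TB read: P(at least one URE) = {p:.1%}")
```

Under this model, 40 TB gives roughly 96%, while reading exactly 10^14 bits gives about 63% (1 - 1/e) rather than a certain failure, and six passes over a 2 TB drive give about 62%. If real drives behave better than the published "max" rate, all of these numbers drop accordingly.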

Furthermore, as already again stated, very likely
an "average" HDD has much lower URE probability.

This is reassuring :)


Is this pure curiosity from your side or are
you trying to achieve something?

There is a report, from CERN I think, providing
real world statistics about HDD problems.

http://storagemojo.com/2007/09/19/cerns-data-corruption-research/

bye,

Yes, I saw this article and read it with great interest. After all, it seems that most data corruption is due to firmware/kernel/driver bugs, and that the URE rate plays a minor role here.

Thank you very much, guys. I'm sorry to bore you with all these questions, but I'm just trying to learn something!
Regards.


--
Danti Gionatan
Technical Support
Assyoma S.r.l. - www.assyoma.it
email: g.danti@xxxxxxxxxx - info@xxxxxxxxxx
GPG public key ID: FF5F32A8