A periodic check does not prevent read() from returning corrupt data in
the meantime, though (RAID6 case). Files copied before the next repair
may be corrupted, even though the RAID would eventually fix itself once
a repair is done.
Yes, there is a performance penalty, but data integrity is also
improved. Paranoid mode should probably not be the default, but I would
like the choice to improve data integrity at the expense of some small
speed penalty. ZFS implements this anti-corruption checking by
checksumming its data. We don't have a simple checksumming mechanism
in md-raid, but we do have the full stripe data available ready for
verification.
BTW, the idea of a daily repair operation doesn't work when it takes 14
hours to repair a large RAID. That would only leave 10 hours of each
day for normal speed access. I schedule repairs weekly, though.
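
For reference, a weekly job can be quite small. This is only a sketch,
assuming the md sysfs files sync_action and mismatch_cnt under
/sys/block/<array>/md/; the function names and the md0 path in the usage
note are my own, and writing to sysfs needs root.

```python
from pathlib import Path


def start_scrub(md_dir, action="check"):
    """Ask md to start a 'check' or 'repair' pass on one array.

    md_dir is the array's sysfs directory (e.g. /sys/block/md0/md, an
    assumed path). Returns False without writing anything if the array
    is already busy with another sync action.
    """
    if action not in ("check", "repair"):
        raise ValueError("action must be 'check' or 'repair'")
    sync_action = Path(md_dir) / "sync_action"
    if sync_action.read_text().strip() != "idle":
        return False
    sync_action.write_text(action + "\n")
    return True


def mismatch_count(md_dir):
    """Return the mismatch counter reported by the last check/repair."""
    return int((Path(md_dir) / "mismatch_cnt").read_text())
```

Called from a weekly cron job as start_scrub("/sys/block/md0/md",
"repair"), with mismatch_count() read once the array is idle again. Note
that this still only finds corruption after the fact, which is the gap I
am describing above.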
--Bart
On 6/11/2014 2:53 AM, Roberto Spadim wrote:
Hi
IMHO
For silent corruption, I think a periodic RAID check is better than
a paranoid mode.
Silent corruption normally occurs with an 'old disk' or with old data,
but it doesn't occur on every disk read (the disk studies should be checked).
I think a 'paranoid' mode is nice, but I think it would reduce overall
system performance; maybe a daily cron check is better than an 'all
read, check' (paranoid) mode.
On Wednesday, June 11, 2014, Bart Kus <me@xxxxxxxx> wrote:
Hello,
As far as I understand, md-raid relies on the underlying devices
to inform it of IO errors before it'll seek redundant/parity data
to fulfill the read request. I have, however, seen certain hard
drives report successful reads while returning garbage data.
Is it possible to set md-raid into a paranoid mode, in which it
reads all available data and confirms integrity? Here's how it
would work:
RAID6: read data + parity 1 + parity 2. If 1 of the 3 mismatches,
correct it, and write corrected data to the corrupt source. Log
the event. If all 3 disagree, alert user somehow.
RAID5: read data + parity. If they mismatch, alert user somehow.
RAID1: read data 1 + data 2. If they mismatch, alert user somehow.
You can see this is mostly useful for RAID6 mode, where there is a
chance at automated recovery. However, it can also be used to
prevent silent data corruption in the other modes, by making it
not silent.
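
The RAID6 and RAID5 cases above can be sketched in user space as follows.
This is an illustration only, not md's code path: it assumes the usual
RAID6 P/Q construction over GF(2^8) with polynomial 0x11d and generator 2,
treats each disk's chunk as a bytes object, and all function names are
made up for the example. It can locate and fix one bad data chunk, or
tell which parity is bad; anything else is reported as unrecoverable.

```python
# Log/antilog tables for GF(2^8), polynomial 0x11d, generator 2 (assumed).
GF_EXP = [0] * 510
GF_LOG = [0] * 256
_x = 1
for _i in range(255):
    GF_EXP[_i] = _x
    GF_LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D
for _i in range(255, 510):
    GF_EXP[_i] = GF_EXP[_i - 255]


def gf_mul(a, b):
    """Multiply two field elements using the log tables."""
    if a == 0 or b == 0:
        return 0
    return GF_EXP[GF_LOG[a] + GF_LOG[b]]


def compute_pq(data):
    """Return (P, Q) for a list of equal-length data chunks."""
    p = bytearray(len(data[0]))
    q = bytearray(len(data[0]))
    for z, chunk in enumerate(data):
        g = GF_EXP[z]  # coefficient g^z for data disk z
        for i, byte in enumerate(chunk):
            p[i] ^= byte
            q[i] ^= gf_mul(g, byte)
    return bytes(p), bytes(q)


def paranoid_read_raid5(data, p):
    """RAID5: detection only. True if data and parity agree."""
    return compute_pq(data)[0] == p


def paranoid_read_raid6(data, p, q):
    """RAID6: verify a stripe, repairing one bad data chunk if possible.

    Returns (status, data) where status is 'ok', 'p-corrupt', 'q-corrupt',
    'data-<z>' (chunk z was repaired in the returned list) or
    'unrecoverable'.
    """
    p2, q2 = compute_pq(data)
    dp = bytes(a ^ b for a, b in zip(p, p2))
    dq = bytes(a ^ b for a, b in zip(q, q2))
    if not any(dp) and not any(dq):
        return "ok", data
    if not any(dp):
        return "q-corrupt", data  # data agrees with P, so Q is the odd one
    if not any(dq):
        return "p-corrupt", data  # data agrees with Q, so P is the odd one
    # Both differ. For one bad data disk z, dq[i] == g^z * dp[i] per byte.
    candidates = set()
    for a, b in zip(dp, dq):
        if a == 0 and b == 0:
            continue
        if a == 0 or b == 0:
            return "unrecoverable", data
        candidates.add((GF_LOG[b] - GF_LOG[a]) % 255)
    if len(candidates) != 1:
        return "unrecoverable", data
    z = candidates.pop()
    if z >= len(data):
        return "unrecoverable", data
    fixed = list(data)
    fixed[z] = bytes(d ^ e for d, e in zip(data[z], dp))  # dp is the error
    return "data-%d" % z, fixed
```

A real implementation would then write the repaired chunk (or the
recomputed P or Q) back to the corrupt source and log the event. RAID1
is the simplest case: compare the copies and alert on any difference.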
--Bart
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Roberto Spadim
SPAEmpresarial
Automation and Control Engineer