Re: detection/correction of corruption with raid6


 



2008/12/5 Peter Rabbitson <rabbit+list@xxxxxxxxx>:
> Michał Przyłuski wrote:
>> Hi,
>>
>> 2008/12/5 Redeeman <redeeman@xxxxxxxxxxx>:
>>> On Fri, 2008-12-05 at 16:09 -0500, Justin Piszcz wrote:
>>>> On Fri, 5 Dec 2008, Redeeman wrote:
>>>>
>>>>> On Fri, 2008-12-05 at 16:02 -0500, Justin Piszcz wrote:
>>>>>> On Fri, 5 Dec 2008, Redeeman wrote:
>>>>>>
>>>>>>> Hello.
>>>>>>>
>>>>>>> I was looking at the PDFs linked to from the wiki, and found this:
>>>>>>> http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
>>>>>>>
>>>>>>> More specifically, section 4, starting on page 8.
>>>>>>>
>>>>>>> Am I understanding this correctly, in that with raid6, linux is capable
>>>>>>> of detecting if the content on 1 disk is corrupted, and reconstruct it
>>>>>>> from the remaining disks?
>>>>>> I ran md/raid6 for awhile, do you mean remap the bad sector on the fly?
>>>>>> Linux/md raid does not do this afaik.
>>>>> No, i mean, if one disk does silent corruption
>>>> What would the error look like?  Both md/Linux & in the 3ware manual
>>>> recommend you run a 'check' across the raid at least once a week
>>>> (3ware/raid-verify) and md/Linux in Debian runs a check once a month I
>>>> believe to eliminate these issues.
>>>>
>>>> If you are asking whether a read error of a latent sector from the one
>>>> disk will result it reading the data from the second disk that is a good
>>>> question.
>>> im asking, if one disk in a raid6 setup suddenly decides to flip a few
>>> bits in some bytes, will it be able to detect that in a scan, and
>>> correct it? i cant see how it can do it on raid5, but maybe raid6?
>>
>> No, not really.
>> I've been investigating silent corruption for a quite a while now, and
>> it looks more or less like this.
>> During a "check" action it'll be detected. During normal operation -
>> it won't be detected.
>> Normal (non-degraded) raid5/6 reads don't read parity (or Q syndrome),
>> they just read data. So they have no idea that something went bad.
>> Now, worse news is that you cannot really fix it automagically, even
>> after detecting by a "check" procedure. A "repair" will overwrite
>> parity and Q syndrome, with new values (new = calculated from what it
>> seems to be data blocks).
>>
>> It is possible (by the theory of Q syndrome, per the article you
>> linked) to detect which drive is doing a silent corruption with raid6
>> (and with some extra assumption, that just one drive is doing that).
>> But it's not implemented.
>>
>
> I'd like to shamelessly bring in an older related thread:
> http://marc.info/?l=linux-raid&m=120605458309825
> http://marc.info/?l=linux-raid&m=120618020817057
>
> Maybe someone will get inspired, and will actually write the damned thing :)

I concur. Even without a "fix", just printing which disk is suspected
of silent corruption would be helpful. One can at least fail the disk
and get rid of it. That is still better than taking wild guesses at
what went wrong. I'm a silent corruption maniac myself, keeping md5's
of most bigger/more important files, so my judgment might not be fair.
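
To make the "which disk is suspected" idea concrete, here is a minimal
Python sketch of the single-disk location trick from section 4 of the
paper linked above. It is an illustration only, not md code: the names
compute_pq and locate_corruption are made up for this example, it
assumes the usual RAID-6 field GF(2^8) with polynomial 0x11d and
generator 2, and it only gives a meaningful answer under the extra
assumption already mentioned, that at most one disk in the stripe is
corrupted.

```python
# Sketch only: hypothetical helper names, not the md implementation.
# GF(2^8) log/antilog tables, polynomial 0x11d, generator g = 2.
GF_EXP = [0] * 255
GF_LOG = [0] * 256
_x = 1
for _i in range(255):
    GF_EXP[_i] = _x
    GF_LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D


def gf_mul(a, b):
    """Multiply two field elements using the log tables."""
    if a == 0 or b == 0:
        return 0
    return GF_EXP[(GF_LOG[a] + GF_LOG[b]) % 255]


def compute_pq(data):
    """Return (P, Q) for a stripe.

    data is a list of equal-length bytes objects, one per data disk
    (fewer than 255 disks assumed). P is the XOR of all blocks, Q is
    the XOR of g^z * D_z over the data disks z.
    """
    n = len(data[0])
    p = bytearray(n)
    q = bytearray(n)
    for z, block in enumerate(data):
        coeff = GF_EXP[z]
        for i, byte in enumerate(block):
            p[i] ^= byte
            q[i] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def locate_corruption(data, p_stored, q_stored):
    """Guess which single disk is corrupt.

    Returns None if the stripe is consistent, "P" or "Q" if only that
    syndrome block disagrees, the index of the suspect data disk, or
    "multiple" if the evidence does not point at exactly one disk.
    """
    p_calc, q_calc = compute_pq(data)
    suspects = set()
    for ps, pc, qs, qc in zip(p_stored, p_calc, q_stored, q_calc):
        dp, dq = ps ^ pc, qs ^ qc
        if dp == 0 and dq == 0:
            continue                      # this byte position is fine
        if dp == 0:
            suspects.add("Q")             # only Q disagrees
        elif dq == 0:
            suspects.add("P")             # only P disagrees
        else:
            # One bad data disk z gives dq = g^z * dp, so z = log(dq/dp).
            z = (GF_LOG[dq] - GF_LOG[dp]) % 255
            suspects.add(z if z < len(data) else "multiple")
    if not suspects:
        return None
    if len(suspects) == 1:
        return suspects.pop()
    return "multiple"
```

For example, after computing P and Q for a four-disk stripe and then
flipping a bit in the block of data disk 2, locate_corruption should
return 2. If two disks are corrupted in the same byte position the
result can point at the wrong disk, which is why the single-disk
assumption matters and why this is only good for reporting a suspect.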
Also, it seems the feature is asked about 3-4 times a year, which
probably makes it the second most requested feature after the numerous
reshape variations.

Regards,
Mike

