Hi,

2008/12/5 Redeeman <redeeman@xxxxxxxxxxx>:
> On Fri, 2008-12-05 at 16:09 -0500, Justin Piszcz wrote:
>>
>> On Fri, 5 Dec 2008, Redeeman wrote:
>>
>> > On Fri, 2008-12-05 at 16:02 -0500, Justin Piszcz wrote:
>> >>
>> >> On Fri, 5 Dec 2008, Redeeman wrote:
>> >>
>> >>> Hello.
>> >>>
>> >>> I was looking at the PDFs linked to from the wiki, and found this:
>> >>> http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
>> >>>
>> >>> More specifically, section 4, starting on page 8.
>> >>>
>> >>> Am I understanding this correctly, in that with raid6, linux is capable
>> >>> of detecting if the content on 1 disk is corrupted, and reconstruct it
>> >>> from the remaining disks?
>> >>
>> >> I ran md/raid6 for awhile, do you mean remap the bad sector on the fly?
>> >> Linux/md raid does not do this afaik.
>> >
>> > No, i mean, if one disk does silent corruption
>>
>> What would the error look like? Both md/Linux & in the 3ware manual
>> recommend you run a 'check' across the raid at least once a week
>> (3ware/raid-verify) and md/Linux in Debian runs a check once a month I
>> believe to eliminate these issues.
>>
>> If you are asking whether a read error of a latent sector from the one
>> disk will result it reading the data from the second disk that is a good
>> question.
>
> im asking, if one disk in a raid6 setup suddenly decides to flip a few
> bits in some bytes, will it be able to detect that in a scan, and
> correct it? i cant see how it can do it on raid5, but maybe raid6?

No, not really. I've been investigating silent corruption for quite a
while now, and it looks more or less like this:

During a "check" action it will be detected.
During normal operation it won't be detected. Normal (non-degraded)
raid5/6 reads don't read parity (or the Q syndrome), they just read
the data blocks, so they have no idea that something went bad.

Now, the worse news is that you cannot really fix it automagically,
even after a "check" has detected it.
A "repair" will simply overwrite parity and the Q syndrome with new
values (new = recalculated from whatever the data blocks currently
contain), so if the bad block was a data block, the corruption stays.

It is possible in theory (using the Q syndrome, per the paper you
linked) to detect which drive is silently corrupting data on raid6,
under the extra assumption that only one drive is doing so.
But it's not implemented.

Greets,
Mike
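
P.S. For reference, the single-drive detection idea from section 4 of
the linked paper can be sketched roughly as below. This is only an
illustration and not md code: the function names are made up, it works
on small in-memory byte strings instead of real stripes, and it assumes
the field described in the paper (GF(2^8) with polynomial 0x11d and
generator 2) and that at most one block per stripe is bad.

```python
# Log/antilog tables for GF(2^8), polynomial 0x11d, generator g = 2
# (assumed to match the field used in the linked raid6.pdf).
GF_EXP = [0] * 510
GF_LOG = [0] * 256
_x = 1
for _i in range(255):
    GF_EXP[_i] = _x
    GF_LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D
for _i in range(255, 510):
    GF_EXP[_i] = GF_EXP[_i - 255]


def gf_mul(a, b):
    """Multiply two field elements using the log tables."""
    if a == 0 or b == 0:
        return 0
    return GF_EXP[GF_LOG[a] + GF_LOG[b]]


def syndromes(data):
    """Compute P (plain XOR) and Q (XOR of g^z * D_z) for one stripe.

    data is a list of equal-length bytes objects, one per data disk.
    """
    n = len(data[0])
    p = bytearray(n)
    q = bytearray(n)
    for z, block in enumerate(data):
        coeff = GF_EXP[z]  # g^z
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def locate_corruption(data, p_stored, q_stored):
    """Return 'ok', 'P', 'Q', a data disk index, or 'unknown'."""
    p_calc, q_calc = syndromes(data)
    culprits = set()
    for pc, ps, qc, qs in zip(p_calc, p_stored, q_calc, q_stored):
        dp = pc ^ ps  # equals the error E if a data disk is bad
        dq = qc ^ qs  # equals g^z * E if data disk z is bad
        if dp == 0 and dq == 0:
            continue
        if dp == 0:
            culprits.add("Q")
        elif dq == 0:
            culprits.add("P")
        else:
            z = (GF_LOG[dq] - GF_LOG[dp]) % 255
            culprits.add(z if z < len(data) else "unknown")
    if not culprits:
        return "ok"
    # More than one suspect contradicts the single-bad-disk assumption.
    return culprits.pop() if len(culprits) == 1 else "unknown"


def repair_data_disk(data, z, p_stored):
    """Rebuild data disk z by XORing the P mismatch back into it."""
    p_calc, _ = syndromes(data)
    return bytes(d ^ pc ^ ps for d, pc, ps in zip(data[z], p_calc, p_stored))
```

With stored P and Q from a stripe that was good when written,
locate_corruption() names the single block that no longer fits, and
repair_data_disk() undoes the damage when that block is a data block.
If more than one drive is lying, the answer is either 'unknown' or
simply wrong, which is the extra assumption mentioned above.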