On Tue, Jan 29, 2013 at 12:18:39AM +0100, Wolfgang Denk wrote:
[...]
> This is what I think at the moment, as all my samples of data so far
> looked ok.

Hi Wolfgang,

my personal opinion would be to confirm that *all*
the data is OK, if you can.
That would point to the parity calculation as
the error source, I guess.

> > This could be in case of some software bug, which
> > would be quite a surprise, I must say.
>
> Indeed.  Guess how I feel...

The lucky one... :-)

> > Still, would be nice to check if the whole array
> > is in this state or if, sooner or later, some
> > known slot (with error) is found somewhere else.
>
> Checks still running.  I see two things:
>
> - on the array where I was running "repair" before, raid6check reports
>   no errors so far - but still there is a mismatch_cnt = 362731480
>   raid6check is still running.

As mentioned, "repair" reports the number of
repairs it did, so unless you ran a "check" after
that, the number is expected, I guess.

> - on the second machine, I have 558579 lines of output, 176 of which
>   are errors of type "Error detected at : disk slot unknown"; no other
>   errors reported so far.  raid6check is still running.

Well, it is slow, I know...

In any case, as I wrote in another post, "unknown"
means both parities are wrong and no single
guilty slot can be identified.
So, either both parities are wrong, and only they
are (best case scenario), or more than one disk has
corrupted data on the same stripe.

> - on the third machine, I have 5512894 lines of output, 1599431 of
>   which are errors of type "Error detected at : disk slot unknown"; no
>   other errors reported so far.  raid6check is still running.
>
> This smells really bad as if parity computation was broken...

Uhm, as mentioned, it would be nice to find
a specific error slot...
Well, not so nice, but it would point to
a HW problem.

> OK, add more hardware details...
>
> A: Supermicro X8SAX mainboard, Core i7 CPU 950 @ 3.07GHz, 24 GB RAM
> H: Supermicro X8ST3 mainboard, Xeon CPU W3565 @ 3.20GHz, 24 GB RAM
> X: Supermicro X8SAX mainboard, Core i7 CPU 950 @ 3.07GHz, 24 GB RAM

What does the kernel log say about the chosen
RAID6 algorithm?
The "dmesg" output should contain this information.

bye,

-- 

piergiorgio
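
To illustrate what "disk slot unknown" means, here is a minimal sketch of
the per-stripe classification discussed above. It assumes the usual RAID-6
field, GF(2^8) with polynomial 0x11d and generator 2, and works on small
in-memory chunks. The function names (syndromes, classify) are hypothetical
and this is not the raid6check source, only the same idea in a few lines.

```python
# Sketch only: GF(2^8) tables for the RAID-6 field (polynomial 0x11d,
# generator 2). Names below are illustrative, not taken from raid6check.
EXP = [0] * 510
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 510):
    EXP[i] = EXP[i - 255]


def gf_mul(a, b):
    """Multiply two field elements using the log/exp tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def syndromes(data):
    """Return (P, Q) for a list of equal-length data chunks."""
    n = len(data[0])
    p = bytearray(n)
    q = bytearray(n)
    for slot, chunk in enumerate(data):
        coef = EXP[slot]  # g**slot
        for k, byte in enumerate(chunk):
            p[k] ^= byte
            q[k] ^= gf_mul(coef, byte)
    return bytes(p), bytes(q)


def classify(data, p, q):
    """Return 'ok', 'P', 'Q', a data slot index, or 'unknown'."""
    p_calc, q_calc = syndromes(data)
    verdicts = set()
    for k in range(len(p)):
        px = p[k] ^ p_calc[k]
        qx = q[k] ^ q_calc[k]
        if px == 0 and qx == 0:
            continue              # this byte is consistent
        if qx == 0:
            verdicts.add("P")     # only P disagrees
        elif px == 0:
            verdicts.add("Q")     # only Q disagrees
        else:
            # Both disagree: a single bad data slot z gives qx = g**z * px.
            z = (LOG[qx] - LOG[px]) % 255
            verdicts.add(z if z < len(data) else "unknown")
    if not verdicts:
        return "ok"
    if len(verdicts) == 1:
        return verdicts.pop()
    return "unknown"              # bytes disagree about the culprit
```

With three data chunks, corrupting one byte of slot 1 makes classify()
return that slot index. Corrupting P and Q independently usually yields a
computed slot outside the valid range, which is reported as "unknown".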