On Thu, Aug 2, 2012, at 13:33, Phil Turmel wrote:
> You really do need to have a process check mismatch_cnt after your
> weekly check completes.

With Fedora, I get an email Monday morning, after the raid-check, which
warns of a non-zero mismatch_cnt.

> Depends. If you use "repair", bad data will be propagated. If you use
> "check", it'll just be reported.

Ah, okay, good. I thought I'd read here a while back that "check" &
"repair" do the same thing.

> I've seen a great deal of good advice here, but nothing about the system
> component least likely to be protected in an "economy" system:
> RAM. Does your Mobo have ECC ram?

Good point. It does not. Might be time for me to upgrade to a mobo with
ECC support.

> does your kernel support logging, and are you monitoring the
> machine check log?

klogd is not running, but I think the latest rsyslog handles the kernel
messages. There was nothing in the logs related to my corruption issues,
however.

> Hard drives write extensive ECC payloads to catch corruptions there;
> SATA and SAS protocols have CRC checks on every frame transferred; the
> PCIe bus uses CRC checks on each lane, with low-level encoding very
> similar to SATA. Even modern processors are using PCIe-style encoded

Thanks, this is good info, and kind of gets at my thinking when I posted
my initial question. In a typical consumer hardware setup, with a
current linux kernel, do I have to take any steps to enable these kinds
of checks? Can the kernel log any failed checks at the levels you
mention?

I guess my confusion with my silent data corruption issues stems from my
naive assumption that all the various data transfers happening would
have some way of detecting or flagging the bad reads as they happened.
But maybe as you suggest, my issue is related to memory, and ECC might
help in the future?
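For anyone on a distro that doesn't send the raid-check mail, the
mismatch_cnt check Phil describes could be sketched roughly as below.
This is only a minimal sketch, not what Fedora's raid-check does: it
assumes the arrays show up as /sys/block/md*/md/mismatch_cnt (the usual
md sysfs layout), and the function names and the sysfs_block parameter
are made up for illustration.

```python
#!/usr/bin/env python3
"""Warn about md arrays whose last check left a non-zero mismatch_cnt."""
import glob
import os


def read_mismatch_counts(sysfs_block="/sys/block"):
    """Return {array_name: mismatch_cnt} for every md array found.

    sysfs_block is a hypothetical parameter, present only so the
    function can be pointed at a test directory instead of real sysfs.
    """
    counts = {}
    pattern = os.path.join(sysfs_block, "md*", "md", "mismatch_cnt")
    for path in sorted(glob.glob(pattern)):
        # path looks like <sysfs_block>/md0/md/mismatch_cnt; two levels up is md0
        name = os.path.basename(os.path.dirname(os.path.dirname(path)))
        with open(path) as f:
            counts[name] = int(f.read().strip())
    return counts


def arrays_with_mismatches(counts):
    """Return the sorted names of arrays whose count is greater than zero."""
    return sorted(name for name, cnt in counts.items() if cnt > 0)


if __name__ == "__main__":
    counts = read_mismatch_counts()
    for name in arrays_with_mismatches(counts):
        # cron mails any output, so printing is enough to raise the alarm
        print(f"WARNING: {name} mismatch_cnt = {counts[name]}")
```

Run from cron some time after the weekly check has finished, this
prints nothing when every array is clean, so mail only arrives when
there is something to look at. It only reads the counter; it never
writes "check" or "repair" to sync_action.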
Thanks,
matt