On 09/27/2016 04:16 AM, Brad Campbell wrote:
> On 27/09/16 09:08, Benjammin2068 wrote:
>>
>> Also, I just did a "repair" and the mismatch is now back to 8... which
>> seems like a suspicious number considering the filesystem on this new
>> drive (because it's a WD10 series with 4096-byte sectors) has a slightly
>> larger FS than the Samsung HD103SJ (and Seagate equivalents) in the
>> array too.
>
> See that is a bad thing to do if you even remotely suspect you have a
> problem. All a "repair" does is check the parity on a stripe and if
> there is a mismatch it re-writes it. You are writing to an array that
> apparently has issues.
>
> I'd be checking the filesystem and file contents very carefully for
> corruption, and running several sequential check actions to keep an eye
> on the mismatch count.
>

Yep.

Once I reconfig'd the hardware and checked the cables in the system, on
boot the number is now 0. (which makes sense at boot - but is creepy)

I put a monitor into munin which I'll be watching closely for when it
changes.

BUT... I think I did find the problem. The card was running hot due to
airflow.

That's been remedied (I hope) -- the temp sensor on the heat-sink for the
PCIe controller now sits around 45°C, which is fine. Before it was >= 60°C. :O

Thanks again everyone,

 -Ben

p.s. The Linux RAID Wiki doesn't cover mismatch_cnt at all.... would be
kinda nice considering how critical (or not) this is... and what to do
about it.
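For anyone who wants to watch the count the same way, here is a rough sketch of the kind of check a monitor could poll. It is not the munin plugin mentioned above, just an illustration. It assumes the array is md0 and reads the md sysfs files mismatch_cnt and sync_action; the function names and the sysfs_root parameter are made up for this example (the parameter is only there so the functions can be pointed at a test directory).

```python
from pathlib import Path


def read_mismatch_cnt(array="md0", sysfs_root="/sys/block"):
    """Return the mismatch count md recorded for the last check/repair.

    'array' defaults to md0, which is an assumption; adjust for your system.
    """
    path = Path(sysfs_root) / array / "md" / "mismatch_cnt"
    return int(path.read_text().strip())


def read_sync_action(array="md0", sysfs_root="/sys/block"):
    """Return the current sync action (e.g. 'idle', 'check', 'repair')."""
    path = Path(sysfs_root) / array / "md" / "sync_action"
    return path.read_text().strip()


def report(array="md0", sysfs_root="/sys/block"):
    """Build a one-line status string suitable for logging or graphing."""
    action = read_sync_action(array, sysfs_root)
    count = read_mismatch_cnt(array, sysfs_root)
    return f"{array}: sync_action={action} mismatch_cnt={count}"
```

Calling print(report("md0")) after each "check" run gives a number to compare between runs. Keep in mind the count only means something once a check has run since boot, which is why it reads 0 right after a restart.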