On Nov 18, 2008, at 6:05 PM, Les Mikesell <lesmikesell@xxxxxxxxx> wrote:
> nate wrote:
>> Les Mikesell wrote:
>>> Yes, apparently RAM errors can be subtle and only appear when certain
>>> adjacent bit patterns are stored - or when the moon is in a certain
>>> phase or something.
>>
>> Don't forget cosmic rays
>> http://adsabs.harvard.edu/abs/1978ITNS...25.1166P
>
> Yeah, but those don't stop when you replace the faulty RAM... Mine
> did, but the errors committed to disk kept randomly re-appearing
> mysteriously as the reads from the RAID1 alternated afterwards.
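
The intermittent reappearance described above can be illustrated with a toy model. This is only a sketch: `ToyMirror` is a hypothetical class, and strict round-robin reads are an assumption made for clarity (real md RAID1 picks a member with its own read-balancing heuristics). It shows that once the two copies differ, the same block can read back good or bad depending on which member serves the request.

```python
# Toy model only. Real md RAID1 does not read strictly round-robin;
# the alternation here is a simplifying assumption.

class ToyMirror:
    """Two-copy mirror that alternates reads between its members."""

    def __init__(self, blocks):
        # Both members start out identical.
        self.members = [list(blocks), list(blocks)]
        self._next = 0

    def read(self, index):
        # Serve the read from one member, then switch to the other.
        # Nothing here compares the two copies.
        data = self.members[self._next][index]
        self._next = 1 - self._next
        return data


mirror = ToyMirror([b"good"])
# Simulate corruption that reached only one member.
mirror.members[1][0] = b"g0od"

reads = [mirror.read(0) for _ in range(4)]
print(reads)  # [b'good', b'g0od', b'good', b'g0od']
```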
Ah, memory-mapped files, another very good reason to use ECC on
large-memory machines.
Also, if you identify bad memory and use software RAID1, it's better to
break the mirror, fsck and fix the remaining half, then rebuild the
mirror from it, as there is no data integrity test on RAID1 to tell
which copy is correct.
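
A minimal sketch of the break / fsck / rebuild sequence, written as a dry run that only prints commands and runs nothing. The device names `/dev/md0` and `/dev/sdb1` and the helper `mirror_repair_plan` are hypothetical. It also assumes the filesystem sits directly on the md device, that it is unmounted during the check, and that `fsck -f` suits the filesystem in use. Which member to drop is a judgment call, since neither copy is known to be the good one.

```python
import shlex


def mirror_repair_plan(array, suspect, fsck_cmd=("fsck", "-f")):
    """Return the commands, in order, for the break / fsck / rebuild cycle."""
    return [
        # 1. Break the mirror: mark one member failed and remove it.
        ["mdadm", array, "--fail", suspect],
        ["mdadm", array, "--remove", suspect],
        # 2. Check and repair the filesystem on the now-degraded array
        #    (assumed to be unmounted at this point).
        [*fsck_cmd, array],
        # 3. Wipe the removed member's md metadata so that adding it back
        #    is treated as a new disk and gets a full resync.
        ["mdadm", "--zero-superblock", suspect],
        ["mdadm", array, "--add", suspect],
    ]


# Hypothetical device names, for illustration only.
plan = mirror_repair_plan("/dev/md0", "/dev/sdb1")
for cmd in plan:
    print(shlex.join(cmd))
```

Once the member is added back, rebuild progress shows up in /proc/mdstat.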
-Ross
_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos