On Tue, Nov 18, 2008 at 9:47 AM, Les Mikesell <lesmikesell@xxxxxxxxx> wrote:
>> Did you leave memtest86+ running for 2 days? I thought 1 or 2 cycles
>> would be good enough?
>>
>> I'm hoping to pick-up the server in the next 2 hours then I can see
>> what happens when I run memtest86+ or other tests
>
> Yes, apparently RAM errors can be subtle and only appear when certain
> adjacent bit patterns are stored - or when the moon is in a certain phase or
> something.
>
> --
> Les Mikesell
> lesmikesell@xxxxxxxxx

When we burn in machines to find errors, we also go with a run of a day
or two.

The one fun thing we found was that the failures were often temperature
related. A machine would crash in the rack, but once it was moved to a
test bench it would no longer exhibit the issue. This was especially true
when the production load taxed both the CPU and the memory, while during
testing we could only really tax one or the other with the existing
tools. So blocking a bit of the airflow in the lab to heat up the case,
or being able to test in the same data center environment, helped a lot.

Most of our errors show up either in the first 2 minutes of running a
memory test or one of the prime number calculations, or they take a day
or a few to show up.

Rob
_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos
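
A rough sketch of loading the CPU and the memory at the same time, as
described above. It is only an illustration, not a tool mentioned in this
thread, and the names count_primes, check_patterns and burn_in are made up
for the example. It runs prime-number busy work in one process per core
while the main process writes bit patterns into a buffer and reads them
back. It can only touch memory the OS hands to the process, and small
buffers may be served from CPU cache, so it is no substitute for
memtest86+.

```python
import multiprocessing as mp
import sys
import time

# Assumed test patterns: all zeros, all ones, and the two alternating-bit bytes.
PATTERNS = (0x00, 0xFF, 0x55, 0xAA)


def count_primes(limit):
    """CPU load: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def check_patterns(buf):
    """Memory load: fill `buf` with each pattern and read it back.

    Returns a list of (pattern, offset) for every byte that read back wrong.
    """
    errors = []
    size = len(buf)
    for p in PATTERNS:
        buf[:] = bytes([p]) * size      # write the pattern (temporary copy)
        if buf.count(p) != size:        # read back: every byte must equal p
            errors.extend((p, i) for i, b in enumerate(buf) if b != p)
    return errors


def cpu_worker(stop_at):
    """Keep one core busy with prime counting until the deadline."""
    while time.time() < stop_at:
        count_primes(50_000)


def burn_in(seconds, mem_mb=64, cpu_workers=None):
    """Run the CPU workers and the memory check side by side.

    Returns the list of memory mismatches found (empty if none).
    """
    stop_at = time.time() + seconds
    n = cpu_workers or mp.cpu_count()
    procs = [mp.Process(target=cpu_worker, args=(stop_at,)) for _ in range(n)]
    for proc in procs:
        proc.start()
    buf = bytearray(mem_mb * 1024 * 1024)
    errors = []
    while time.time() < stop_at:
        errors.extend(check_patterns(buf))
    for proc in procs:
        proc.join()
    return errors


# Only runs when a duration in seconds is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    found = burn_in(float(sys.argv[1]))
    print("memory mismatches:", len(found))
```

Saved under any file name, say burnin.py, it would be started with
"python burnin.py 3600" for a one-hour run. Each pattern write makes a
temporary copy, so peak memory use is roughly twice mem_mb.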