Scott Silva wrote:
> on 5-27-2008 10:16 AM Ross S. W. Walker spake the following:
> > sbeam wrote:
> > > On Tuesday 27 May 2008 11:39, Scott Silva wrote:
> > > >
> > > > Running memtest for 24 hours should be enough to test the ram.
> > > > A 3ware 7006 is a fairly old card. Does it have the latest bios available
> > > > from 3ware?
> > > >
> > > > You could always eliminate the 3ware controller by installing a drive on
> > > > whatever built in controller it has.
> > >
> > > this is a production server, so running an extended memtest not going to
> > > happen. But I can swap it out and put it in a backup system to do the test.
> > > It's beginning to look a lot like a RAM issue as I have now seen a couple
> > > segfaults from programs that have always run fine. Every kernel panic message
> > > is different (crashed again 1 hour ago). Fans and case temp are nominal.
> > >
> > > the 3ware card was just purchased last month, it has the latest firmware and
> > > bios installed.
> > >
> > > the memory is from PQI - supposed to be an OK brand right? it has a lifetime
> > > warranty... heh
> > >
> > > next steps... HA and fault-tolerant clustering, per the adjacent thread...
> > > this is the cautionary tale come to life.
> >
> > It would be great if there were a simple machine that you could plug
> > a bunch of dimms of varying types into and it will perform high-speed
> > tests on them continuously and flag ones that show an error.
> >
> > Then you could test all memory modules thoroughly before putting them
> > into production servers (or any server for that matter).
>
> That is why a good long burn in test is a worthwhile thing to
> plan for. That is unless you need to rush a replacement
> server out quickly.

Yes, but even then, with say 16GB or 32GB of memory it happens
that some errors just fall through the cracks.

> I usually run memtest86 for 48 hours, and then run a burn in
> test with some load.
>
> There are simple machines for testing memory, but they tend
> to be very expensive and time consuming. Manufacturers can't
> take the time to do thorough memory tests before they ship,
> so they usually do some quick go-nogo tests and depend on
> their warranty dept. to do the hard tests.
>
> I don't think it would pay for anyone to buy one of these
> testers, unless you are a very large var like Dell or HP. It
> is easier (and probably cheaper) to just send new ram out and
> send the returns back to your supplier for them to check.

I actually found a memory testing system for around $4K. Yes, that's
about the cost of a well-equipped server, but if it works well it
should earn its keep pretty quickly.

It's called RAMCHECK. I priced out the DDR/DDR2 unit, but there are
add-ons for SODIMM, SDRAM, and EDO; if you got it fully loaded I
suspect it would be around $5K.

The company is called Innoventions: http://www.memorytesters.com/

They're government registered and CDW seems to resell it, so it
isn't completely suspect.

-Ross

_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos