On Mon, 14 Feb 2011, Nico Kadel-Garcia wrote:

> But the accumulated costs of the higher end motherboard, memory,
> shortage of space for upgrades in the same unit, the downtime at the
> BIOS to reset the "disabled by default" ECC settings in the BIOS, and
> the system monitoring to detect and manage such errors add up *really
> fast* in a moderate sized shop.

Really? Tweaking a BIOS setting is a silly argument. You'll typically
find ECC is enabled by default, and if you can't get BIOS settings
right when you set up, that's your own fault.

Buy a Dell server with ECC. Don't install any software at all. Come an
ECC error, you'll have an orange blinky light immediately warning you
of impending doom, and it'll even tell you on the front display the
details of the fault, including which DIMM needs replacing. If you can
be bothered to install OMSA (run one command, one yum install), it'll
drop you an email when it fails.

Compare that with not running ECC: you wait until your machine
randomly reboots. You ponder whether it's RAM/CPU/Motherboard. You
just ignore it. It does it again. You then have a fun game of running
memtest while pulling DIMMs out to try to work out which of the 16 is
causing the issue. Joy unbounded.

And what do you mean about shortage of space for upgrades? What that
has to do with ECC, I have no idea.

> Please, name a single instance in the last 10 years where ECC
> demonstrably saved you work, especially if you made sure to burn in
> the system components on servers upon their first bootup...

I've had plenty of HPC nodes that have warned of corrected memory
errors. I've been able to drop them out of the queues, get the memory
fixed, and put them back into service without anyone noticing. Without
ECC, I'd potentially have introduced errors into their results, and
you're much more likely to get the first random reboot without
warning, costing them time. I've had memory errors creep in after 4
years; it's not something that always bites at the beginning.
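The corrected-error warnings mentioned above come from counters the
hardware keeps. On Linux the kernel's EDAC subsystem publishes them
under /sys/devices/system/edac/mc/. As a rough sketch, assuming an EDAC
driver for the chipset is loaded (the function names below are
illustrative and not part of OMSA or any other tool named here), a poll
of those counters might look like this:

```python
import glob
import os

# Standard EDAC sysfs location; only present when an EDAC driver is loaded.
EDAC_ROOT = "/sys/devices/system/edac/mc"


def read_count(path):
    """Return the integer stored in a sysfs counter file, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def edac_counts(root=EDAC_ROOT):
    """Map each memory controller (mc0, mc1, ...) to (corrected, uncorrected)."""
    counts = {}
    for mc in sorted(glob.glob(os.path.join(root, "mc[0-9]*"))):
        ce = read_count(os.path.join(mc, "ce_count"))  # corrected errors
        ue = read_count(os.path.join(mc, "ue_count"))  # uncorrected errors
        counts[os.path.basename(mc)] = (ce, ue)
    return counts


def needs_attention(counts):
    """True if any controller reports a non-zero error count."""
    return any((ce or 0) > 0 or (ue or 0) > 0 for ce, ue in counts.values())
```

Called from cron as needs_attention(edac_counts()), something like this
could be the trigger for draining a node from the queue before the
DIMM gets worse. The threshold of "any non-zero count" is an assumption;
a real shop may prefer to alert on the rate of change instead.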
Equally, I've had file servers do the same. Running a file server
without ECC is a recipe for disaster, as you're risking silent data
corruption.

jh
_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos