Re: Is ECC memory any use?

On Sun, 2007-12-09 at 04:45 -0500, Chris Snook wrote:
> Timothy Murphy wrote:
> > I'm getting memory for a very old (P2B-LS) Asus motherboard,
> > and I see I can get ECC memory for some 20% more.
> > 
> > Is there any point in getting this?
> > I see there is quite a lot of work
> > in getting ECC testing incorporated into the Linux kernel.
> > But even if it were there, would it be very valuable?
> > 
> > I have a feeling that disk errors are far more likely
> > than RAM errors.
> > Is that right?
> > 
> > 
> 
> Depends who's buying.  Few people do anything on "personal" systems that really 
> justifies ECC RAM, though I'm sure the exceptions are probably on this list.  If 
> you're doing any kind of business work where uptime is important, or any kind of 
> technical work where bit flips could cause nasty side effects, it's probably 
> worth buying the ECC, unless you're doing high-end graphics where a stray pixel 
> won't make a difference and most of your power budget is going to the GPUs.
> 
> 	-- Chris
> 
You are assuming that only data resides in memory, which is not the
case.  Program code lives there too, and a program will act quite
strangely if one of its bits flips.  Some manufacturers have dropped
ECC because leaving it out is cheaper.  Some other kinds of systems use
an alternative scheme, but I prefer ECC.  That said, my current system
doesn't have it because I misread the spec.

Bits do not typically "flip occasionally".  As memory ages, a "feature"
of CMOS creates bridges internal to the silicon, and these in turn
cause failures.  If you are lucky, the failed bit will happen to match
what is written to it, in which case no failure shows up.  On the other
hand, a bridged bit can change state: if you wrote a 0 to a bit that
bridges to a 1, a program can run for a while, then FLIP! and it
doesn't work.  You will get an error message that probably says nothing
about a bit flipping, and worse, the failure will not be repeatable,
because programs are dynamically loaded, data is paged in and out, and
memory is occasionally moved around (depending on how the OS optimizes
memory access and how the language or OS does garbage collection).

With ECC memory your programs keep working, because the system can
recover from these "single bit errors".  Note that very few ECC systems
will correct multibit errors.
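
To make the single-bit versus multibit distinction concrete, here is a
toy sketch in Python of a Hamming code with one extra overall parity
bit (often called SECDED: single error correct, double error detect).
It protects only 4 data bits with 4 check bits; real ECC memory works
on much wider words, typically in hardware, so treat this as an
illustration of the principle and not as what any particular memory
controller does.  The function names are made up for the example.

```python
def secded_encode(nibble):
    """Encode 4 data bits as an 8-bit codeword (list of 0/1 values).

    Index 0 holds the overall parity bit; indexes 1, 2 and 4 hold the
    Hamming parity bits; indexes 3, 5, 6 and 7 hold the data bits.
    """
    code = [0] * 8
    code[3] = (nibble >> 0) & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # makes the total parity even
    return code


def secded_decode(code):
    """Return (nibble, status); nibble is None if the word is unrecoverable."""
    code = list(code)
    # XOR of the positions of all set bits is 0 for a valid codeword,
    # and equals the position of the bad bit after a single flip.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    parity_bad = sum(code) % 2 == 1

    if syndrome == 0 and not parity_bad:
        status = "ok"
    elif parity_bad:
        # Odd overall parity: exactly one bit flipped, at index `syndrome`
        # (index 0 is the overall parity bit itself).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Even overall parity but a nonzero syndrome: two bits flipped.
        return None, "uncorrectable"

    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


if __name__ == "__main__":
    word = secded_encode(0b1011)
    word[6] ^= 1                      # one bad bit: recovered
    print(secded_decode(word))
    word[2] ^= 1                      # a second bad bit: only detected
    print(secded_decode(word))
```

A single flipped bit anywhere in the word is repaired silently, which
is the "programs keep working" case.  Two flipped bits are noticed but
cannot be repaired, so the best the system can do is report the error.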

	If you see too many failures in running code, under strange and
non-repeatable conditions, you should begin to suspect memory errors,
whether or not you have ECC memory.
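
If the machine does have ECC memory and the kernel's EDAC driver for
its memory controller is loaded, the error counts can be read from
sysfs.  The following is only a sketch: it assumes the usual EDAC
layout under /sys/devices/system/edac/mc with per-controller ce_count
(corrected) and ue_count (uncorrected) files, and the function name is
made up for the example.  On a system without ECC or without EDAC
support it will simply find nothing, and a memory tester is the better
tool.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": n, "ue_count": n}} from EDAC sysfs.

    The directory layout is an assumption about the running kernel;
    a counter that is missing or unreadable is reported as None.
    """
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    found = read_edac_counts()
    if not found:
        print("no EDAC memory controllers found")
    for controller, entry in found.items():
        print(controller, entry)
```

A corrected count that keeps climbing suggests a module is on its way
out, even though nothing has visibly failed yet.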

Regards,
Les Howell

-- 
fedora-list mailing list
fedora-list@xxxxxxxxxx
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list