memory testing

Hi,

Bad RAM is uncommon, but it comes up regularly enough to cause folks
a lot of grief. I'm wondering if there's a way to make it easier to
get the bad news :-\ In particular, there are cases where RAM defects
just don't show up within a few hours of memtest86+. It can take days
of continuous testing, which is so inconvenient that the test itself
seems worse than the problem.

Here's what I've got so far:

1. Fedora includes /boot/memtest86+-5.01 on every installation. But
this is a legacy/BIOS program. Recommending that folks enable
CSM/legacy BIOS just to test their RAM is questionable, because it
means disabling UEFI Secure Boot to do it. Lie-in-wait malware is
perhaps rare but plausible. The UEFI-native alternative, MemTest86
(without the +), is not free software, so it can't be included. I
kinda wonder if including memtest86+ should be deprecated.

2. The kernel has a built-in memory tester. Therefore it can run on
anything. But how good is it? Is it worth enabling? Should it be
enabled for all kernels or just debug kernels? The code is pretty
simple, so will it catch only the worst cases of bad RAM?
# CONFIG_MEMTEST is not set
https://elixir.bootlin.com/linux/v5.8-rc4/source/mm/memtest.c

3. "memory interface test" used at Google, Apache 2.0 license
https://github.com/stressapptest/stressapptest

4. "multiple concurrent kernel compiles" and "GCC seems to have memory
usage patterns that reliably trigger memory errors that
aren't caught by memtest"
https://lore.kernel.org/linux-btrfs/799cf552-4612-56c5-b44d-59458119e2b0@xxxxxxxxx/
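
To make the question in item 2 more concrete, the general technique
behind a simple pattern tester (fill a region with a fixed pattern,
read it back, record mismatches) can be sketched as below. This is a
user-space illustration in Python, not the kernel's code: the pattern
list, the pattern_pass() and stuck_at_zero() names, and the simulated
fault are all assumptions made up for the example.

```python
import struct

# Illustrative 64-bit patterns (an assumption, not the kernel's list).
PATTERNS = [
    0x0000000000000000,
    0xFFFFFFFFFFFFFFFF,
    0x5555555555555555,
    0xAAAAAAAAAAAAAAAA,
]


def stuck_at_zero(buf, byte_offset=17, bit=3):
    """Hypothetical fault: one bit in the buffer always reads back as 0."""
    buf[byte_offset] &= ~(1 << bit) & 0xFF


def pattern_pass(buf, pattern, fault=None):
    """Fill buf with a 64-bit pattern, read it back, and return the byte
    offsets of the 8-byte words that no longer match."""
    word = struct.pack("<Q", pattern)
    nwords = len(buf) // 8
    for i in range(nwords):
        buf[i * 8:(i + 1) * 8] = word
    if fault is not None:
        fault(buf)  # simulate the defect between the write and the read
    return [
        i * 8
        for i in range(nwords)
        if bytes(buf[i * 8:(i + 1) * 8]) != word
    ]


if __name__ == "__main__":
    region = bytearray(64)
    for p in PATTERNS:
        print(f"{p:#018x}: bad offsets {pattern_pass(region, p, stuck_at_zero)}")
```

In this sketch the stuck bit is only noticed by the patterns that
happen to write a 1 to it, which is why such testers cycle through
several patterns. A fixed pattern sweep like this says nothing about
defects that depend on timing, temperature, or access patterns, which
may be part of why heavier workloads find errors that simple tests miss.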

Example of btrfs catching a bit flip:
https://lore.kernel.org/linux-btrfs/f42fc0d6-5dc9-dd15-9d61-53efb04fad33@xxxxxxx/
That said, btrfs is not a good memory tester. Sometimes the
corruption happens before the csum is computed, so it's not going to
catch everything.
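
The ordering problem can be shown with a minimal sketch. It assumes
plain CRC32 from Python's zlib as a stand-in for the filesystem
checksum, and write_block(), read_ok() and flip_bit() are hypothetical
helpers, not btrfs interfaces.

```python
import zlib


def flip_bit(data, byte_index, bit):
    """Return a copy of data with a single bit flipped."""
    out = bytearray(data)
    out[byte_index] ^= 1 << bit
    return bytes(out)


def write_block(data):
    """Checksum the data as it is at write time and 'store' both."""
    return data, zlib.crc32(data)


def read_ok(data, csum):
    """Verify stored data against its stored checksum."""
    return zlib.crc32(data) == csum


if __name__ == "__main__":
    original = b"important file contents"

    # Bit flips after the checksum was computed: the mismatch is caught.
    data, csum = write_block(original)
    print("flip after csum verifies: ", read_ok(flip_bit(data, 3, 0), csum))

    # Bit flips before the checksum is computed: the checksum covers the
    # already-corrupted data, so verification passes.
    data, csum = write_block(flip_bit(original, 3, 0))
    print("flip before csum verifies:", read_ok(data, csum))
    print("data still intact:        ", data == original)
```

In the second case the checksum faithfully protects the wrong data, so
a clean verify does not prove the RAM was good at write time.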

Any other ideas on how to make this better?

Thanks,
-- 
Chris Murphy