[PATCH 3/3] Add documentation and credits for BadRAM

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Add Documentation/BadRAM.txt for in-depth information and update
Documentation/kernel-parameters.txt.

Signed-off-by: Stefan Assmann <sassmann@xxxxxxxxx>
---
 CREDITS                             |    9 +
 Documentation/BadRAM.txt            |  370 +++++++++++++++++++++++++++++++++++
 Documentation/kernel-parameters.txt |    6 +
 3 files changed, 385 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/BadRAM.txt

diff --git a/CREDITS b/CREDITS
index dca6abc..d57d4af 100644
--- a/CREDITS
+++ b/CREDITS
@@ -2899,6 +2899,15 @@ S: 6 Karen Drive
 S: Malvern, Pennsylvania 19355
 S: USA
 
+N: Rick van Rein
+E: rick@xxxxxxxxxxx
+W: http://rick.vanrein.org/
+D: Memory, the BadRAM subsystem dealing with defective RAM modules.
+S: Haarlebrink 5
+S: 7544 WP  Enschede
+S: The Netherlands
+P: 1024D/89754606  CD46 B5F2 E876 A5EE 9A85  1735 1411 A9C2 8975 4606
+
 N: Stefan Reinauer
 E: stepan@xxxxxxxx
 W: http://www.freiburg.linux.de/~stepan/
diff --git a/Documentation/BadRAM.txt b/Documentation/BadRAM.txt
new file mode 100644
index 0000000..3fb4994
--- /dev/null
+++ b/Documentation/BadRAM.txt
@@ -0,0 +1,370 @@
+INFORMATION ON USING BAD RAM MODULES
+====================================
+
+The BadRAM feature enables Linux to run on broken memory.  The
+resulting system will be stable and healthy, because the kernel
+simply never allocates the faulty pages for use.  This is how
+to setup BadRAM if your memory is failing.
+
+
+Introduction
+------------
+
+As RAM memory grows smaller, it also becomes harder to manufacture
+chips that are perfect.  Each single cell that is failing could cause
+an entire memory module to fail.  Even though manufacturers put in
+extra cells to replace failed ones, it is still possible that the
+sensitive small structures get damaged by an electric discharge on
+their pins.  Such damage leads to problems in fixed locations of
+the address space of a memory module, which is what theory predicts
+and has been confirmed by years of experience with bad memory.
+
+It is not necessary for such a memory module to be discarded.  All
+pages of memory behave the same, and if only we skip the failing
+pages we can continue to use the module for many more years.  The
+operating system kernel simply has to avoid using the blocks that
+are damaged.  This is easy to do in the part of the kernel where
+memory pages are allocated.
+
+
+Reasons for using BadRAM
+------------------------
+
+Chip manufacturing processes use lots of harsh chemicals, and the less
+of these used, the better.  Being able to make good use of partially
+failed memory chips means that far less of those chemicals are needed
+to provide storage.  This reduces expenses and it is lighter on the
+environment in which we live.
+
+This kernel feature clearly shows that Linux is "the flexible OS".
+If something does not work, fix it.  Also, share it with all the
+others that could use it.  After more than a decennium of BadRAM,
+the response has been purely positive, because it has helped real
+people to solve real problems.
+
+One important use for this feature is with laptops that have their
+memory soldered in.  Such laptops would have to be discarded as a
+whole, but with BadRAM in place they can continue to be used
+without further restrictions.
+
+Finally, running a system on broken memory is just plain cool ;-)
+
+
+Running example
+---------------
+
+To run this project, I was given two DIMMs, 32 MB each. One, that we
+shall use as a running example in this text, contained 512 faulty bits,
+spread over 1/4 of the address range in a regular pattern.  This looks
+a lot like the fauly pattern that many others have reported; the only
+common other pattern is a single faulty spot.  With such memory, a few
+tricks with a thorough RAM tester and some binary calculations suffice
+to write these fault patterns down in 2 longword numbers.  The format
+of these is hexadecimal, which is a condensed way of writing down the
+binary patterns that make the hardware patterns recognisable.
+
+After being patched and invoked with the properly formatted description,
+the kernel held back only the memory pages with faults, and never handed
+them out for allocation. The allocation routines could therefore
+progress as normally, without any adaption.  This is important, since
+all the work is done at booting time.  After booting, the kernel does
+not have to do spend any time to implement BadRAM.
+
+As a result of this initial exercise, I gained 30 MB out of the 32 MB
+DIMM that would otherwise have been thrown away.  Of course, these
+numbers scale up with larger memory modules, but the principle is
+the same.
+
+
+The structure of memory failures
+--------------------------------
+
+Memory chips are usually laid out in a roughly equal number of rows
+and columns, making it a square of cells that each store one bit.
+When addressing a bit, the processor sends the row and column in
+separate phases, and then reads or writes its value.  The rows and
+columns are therefore visible on the outside of a chip.
+
+The connections of row and column lines to the outside world is
+usually protected by a buffer.  It can happen that a static
+discharge damages such a buffer, causing an entire row or an
+entire column to fail.  This means that a series of bits become
+unusable in a single page or in a regular pattern of pages,
+depending on whether it was a row or column that got damaged.
+
+For this reason, BadRAM was designed to describe memory faults
+in a pattern of address/mask pairs.  An address locates an
+error and a zero on the corresponding position in the mask
+defines which bits in the address may be replaced with any
+other value.  This has shown to work as a tight description
+of error patterns: it is very compact, but does not waste pages
+that are good.
+
+
+BadRAM's notation for memory faults
+-----------------------------------
+
+Instead of manually providing all 512 errors in the running example
+to the kernel, it's easier to use a pattern notation. Since the
+regularity is based on address decoding software, which generally
+takes certain bits into account and ignores others, we shall
+provide a faulty address F, together with a bit mask M that
+specifies which bits must be equal to F. In C code, an address A
+is faulty if and only if
+
+	(F & M) == (A & M)
+
+or alternately (closer to a hardware implementation):
+
+	~((F ^ A) & M)
+
+In the example 32 MB chip, I had the faulty addresses in 8MB-16MB:
+
+	xxx42f4         ....0100....
+	xxx62f4         ....0110....
+	xxxc2f4         ....1100....
+	xxxe2f4         ....1110....
+
+The second column represents the alternating hex digit in binary form.
+Apparently, the first and next to last binary digit can be anything,
+so the binary mask for that part is 0101. The mask for the part after
+this is 0xfff, and the part before should select anything in the range
+8MB-16MB, or 0x00800000-0x01000000; this is done with a bitmask
+0xff80xxxx. Combining these partial masks, we get:
+
+	F=0x008042f4    M=0xff805fff
+
+That covers every fault in this DIMM; for more complicated failing
+DIMMs, or for a combination of multiple failing DIMMs, it can be
+necessary to set up a number of such F/M pairs.
+
+
+Getting started
+---------------
+
+If you experience RAM trouble, first read Documentation/memory.txt
+and try out the mem=4M trick to see if at least some initial parts
+of your RAM work well.  Note that 4 MB will not be able to hold a
+modern desktop, so if you rely on that you would have to set the
+limit higher (and accept that your sanity check is not as tight as
+possible).
+
+The BadRAM routines halt the kernel in panic if the reserved area
+of memory (containing kernel stuff) contains a faulty address.  It
+will only do that when supplied with the patterns below; this
+initial check is merely to see if this is likely to happen.
+
+
+Running a memory checker
+------------------------
+
+There is no memory checker built into the kernel, to avoid delays
+at runtime or while booting. If you experience problems that may
+be caused by RAM, run a good outside RAM checker.  The Memtest86
+checker is a popular, free, high-quality checker.  Many Linux
+distributions include it as an alternate boot option, so you may
+simply find it in your boot loader's boot menu.
+
+
+The memory checker lists all addresses that have a fault.  It will
+do this for a given configuration of the DIMMs in your motherboard;
+if you replace or move memory modules you may find other addresses.
+In the running example's 32 MB chip, with the DIMM in slot #0 on
+the motherboard, the errors were found in the 8MB-16MB range:
+
+	xxx42f4
+	xxx62f4
+	xxxc2f4
+	xxxe2f4
+
+The error reported was a "sticky 1 bit", a memory bit that always
+reads as "1" even if a "0" was just written to it.  This is
+probably caused by a damaged buffer on one of the rows or columns
+in one of the memory chips.
+
+It would be a lot of work to collect the individual errors and
+condense them into a pattern.  That is why I patched the
+Memtest86 (v2.3+) checker to directly print out the address/mask
+pairs that are used by this kernel feature. All you would do is
+select the BadRAM printout option at the start of the scan, and
+then leave it running for hours and hours, until it has made at
+least one pass.  The patterns are printed each time a bit is
+added, but each line contains all faults found up to that point,
+so you would write down the last set of patterns printed, and
+supply that as a boot option in your next run of a
+BadRAM-capable Linux kernel.
+
+If you use this patch on an x86_64 architecture, your addresses are
+twice as long.  Fill up with zeroes in the address and with f's in
+the mask.  The latter example would thus become:
+
+	mem=24M badram=0x0000000000f00000,0xfffffffffff00000
+
+The patch applies the changes to both x86 and x86_64 code bases
+at the same time.  Patching but not compiling maps the entire
+source tree at once, which makes more sense than splitting the
+patch into an x86 and x86_64 branch, because those two branches
+could not be applied at the same time because they would overlap.
+
+
+Rebooting Linux
+---------------
+
+Once the fault patterns are known we simply restart Linux with
+these F/M pairs as a parameter If your normal boot options look
+like
+
+       root=/dev/sda1 ro
+
+you should now boot with options
+
+       root=/dev/sda1 ro badram=0x008042f4,0xff805fff
+
+or perhaps by mentioning more F/M pairs in an order F0,M0,F1,M1,...
+When you provide an odd number of arguments to badram, the default
+mask 0xffffffff (meaning that only one address is matched) is
+applied to the last address.
+
+If your bootloader is GRUB, you can supply this additional
+parameter interactively during boot.  This way, you can try them
+before you edit /boot/grub/grub.conf to put them in forever.
+
+When the kernel now boots, it should not give any trouble with RAM.
+Mind you, this is under the assumption that the kernel and its data
+storage do not overlap an erroneous part. If they do, and the
+kernel does not choke on it right away, BadRAM itself will stop the
+system with a kernel panic.  When the error is that low in memory,
+you will need additional bootloader magic, to load the kernel at an
+alternative address.
+
+Now look up your memory status with
+
+	cat /proc/meminfo |grep HardwareCorrupted
+
+which prints a single line with information like
+
+HardwareCorrupted:  2048 kB
+
+The entry HardwareCorrupted: 2048k represents the loss of 2MB
+of general purpose RAM due to the errors. Or, positively rephrased,
+instead of throwing out 32MB as useless, you only throw out 2MB.
+Note that 2048 kB equals 512 pages of 4kB.  The size of a page is
+defined by the processor architecture.
+
+If the system is stable (which you can test by compiling a few
+kernels, and a few file finds in / or so) you can decide to add
+the boot parameter to /boot/grub/grub.conf, in addition to any
+other boot parameters that may already be there.  For example,
+
+	kernel /boot/vmlinuz root=/dev/sda1 ro
+
+would become
+
+	kernel /boot/vmlinuz root=/dev/sda1 ro badram=0x008042f4,0xff805fff
+
+Depending on how helpful your Linux distribution is, you may
+have to add this feature again after upgrading your kernel.  If
+your boot loader is GRUB, you can always do this manually if you
+rebooted before you remembered to make that adaption.
+
+
+BadRAM classification
+---------------------
+
+This technique might start a lively market for "dead" RAM. It is
+important to realise that some RAMs are more dead than others. So,
+instead of just providing a RAM size, it is also important to know
+the BadRAM class, which is defined as follows:
+
+	A BadRAM class N means that at most 2^N bytes have a problem,
+	and that all problems with the RAMs are persistent: They
+	are predictable and always show up.
+
+The DIMM that serves as an example here was of class 9, since 512=2^9
+errors were found. Higher classes are worse, "correct" RAM is of class
+-1 (or even less, at your choice).
+Class N also means that the bitmask for your chip (if there's just one,
+that is) counts N bits "0" and it means that (if no faults fall in the
+same page) an amount of 2^N*PAGESIZE memory is lost, in the example on
+an x86 architecture that would be 2^9*4k=2MB, which accounts for the
+initial claim of 30MB RAM gained with this DIMM.
+
+Note that this scheme has deliberately been defined to be independent
+of memory technology and of computer architecture.
+
+
+Further Possibilities
+---------------------
+
+**Slab allocation support**
+
+It would be possible to use even more of the faulty RAMs by employing
+them for slabs. The smaller allocation granularity of slabs makes it
+possible to throw out just, say, 32 bytes surrounding an error. This
+would mean that the example DIMM only caused a loss of 16kB instead
+of 2MB, or scaled-up similar values for larger memory sizes.  One
+specific area that could benefit from this is the growing market
+for embedded devices, which usually wants to meet tight budgets.
+
+It should be possible to make the slab allocator prefer pages with
+broken memory, and allocate the faulty places in memory before the
+other slabs are made available to the kernel.  In the best possible
+situation, this could reduce the loss of good RAM cells to zero!
+
+**Support for low-memory errors**
+
+To the best of my knowledge, boot loaders like GRUB cannot load
+the Linux kernel in non-standard locations.  This means that any
+errors at low memory locations cannot be overcome with BadRAM.
+
+Anything that physically alters the memory layout can be used
+to overcome such problems; this may be achieved through BIOS
+settings, or by adding or swapping memory modules.
+
+A general solution could be to use a boot loader that can load
+the Linux kernel (and its initial memory allocation) at other
+memory addresses than are standard.
+
+
+**Boot-time memory checking**
+
+Many suggestions have been made to insert a RAM checker at boot time;
+since this would leave the time to do only very meager checking, it
+is not a reasonable option; we already have a half-done BIOS check
+doing that!
+
+**ECC RAM integration**
+
+It would be interesting to integrate this functionality with the
+self-verifying nature of ECC RAM. These memories can even distinguish
+between recoverable and unrecoverable errors! Such memory has been
+handled in older operating systems by `testing' once-failed memory
+blocks for a while, by placing only (reloadable) program code in it.
+
+I possess no faulty ECC modules to work this out, and there is no
+general use for it either.
+
+
+Names and Places
+----------------
+
+The home page of this project is on
+	http://rick.vanrein.org/linux/badram
+This page also links to Nico Schmoigl's experimental extensions to
+this patch (with debugging and a few other fancy things).
+
+In case you have experiences with the BadRAM software which differ from
+the test reportings on that site, I hope you will mail me with that
+new information.
+
+The BadRAM project is an idea and implementation by
+	Rick van Rein
+	Haarlebrink 5
+	7544 WP  Enschede
+	The Netherlands
+	rick@xxxxxxxxxxx
+If you like it, a postcard would be much appreciated ;-)
+
+
+							Enjoy,
+							 -Rick.
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index cc85a92..ba3e984 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -51,6 +51,7 @@ parameter is applicable:
 	FB	The frame buffer device is enabled.
 	GCOV	GCOV profiling is enabled.
 	HW	Appropriate hardware is enabled.
+	HWPOISON Handling of memory pages reported as being corrupt
 	IA-64	IA-64 architecture is enabled.
 	IMA     Integrity measurement architecture is enabled.
 	IOSCHED	More than one I/O scheduler is enabled.
@@ -373,6 +374,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 
 	autotest	[IA64]
 
+	badram=		When CONFIG_MEMORY_FAILURE is set, this parameter
+			allows memory areas to be flagged as HWPOISON.
+			Format: <addr>,<mask>[,...]
+			See Documentation/BadRAM.txt
+
 	baycom_epp=	[HW,AX25]
 			Format: <io>,<mode>
 
-- 
1.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]