Erik Mouw wrote:
On Wed, Jan 05, 2005 at 02:56:30PM +0400, Brad Campbell wrote:
I beg to differ on this one. Having spent several weeks tracking down
random processes dying on a machine, which turned out to be caused by a
bad sector in the swap partition, I have had great results running swap
on RAID-1. If you develop a bad sector in a non-mirrored swap, bad things
happen unpredictably and can be a royal PITA to chase down. It's just a
little extra peace of mind.
If you have a bad block in your swap partition and the device doesn't
report an error about it, no amount of RAID is going to help you
against it.
The drive IS reporting read errors in most cases. But that does not
help, really: the kernel swapped out some memory but can't read it back,
so things are screwed. It is just like hot-removing a DIMM while the
system is running: the kernel loses part of its memory and can't
work anymore. Depending on what was in there, of course, the whole
system may be screwed, or just a single process... The discussion here
isn't about "undetectable" (unreported, etc.) errors, but about the fact
that the error exists. And if your swap is on RAID, then when one
component of the array behaves badly, another component will continue
to work, so the system will carry on just fine, as if nothing had
happened, when one of the "swap components" (I mean the underlying
devices) fails for whatever reason.
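
To make the difference concrete, here is a toy model in Python of the
argument above. It is only a sketch, not how the md driver is actually
implemented: the names Disk, Mirror and ReadError are made up for
illustration, and it assumes the failing device REPORTS the read error.
A single device can only pass the error up, while a mirror retries the
same sector on the other component.

```python
class ReadError(Exception):
    """Hypothetical error: the device reported an unreadable sector."""


class Disk:
    """Toy block device: sector number -> data, plus a set of bad sectors."""

    def __init__(self, blocks, bad_sectors=()):
        self.blocks = dict(blocks)
        self.bad_sectors = set(bad_sectors)

    def read(self, sector):
        # A bad sector is reported to the caller, not silently corrupted.
        if sector in self.bad_sectors:
            raise ReadError(f"sector {sector} unreadable")
        return self.blocks[sector]


class Mirror:
    """Toy RAID-1: every component holds a full copy of the data."""

    def __init__(self, *components):
        self.components = components

    def read(self, sector):
        last_error = ReadError("mirror has no components")
        for disk in self.components:
            try:
                return disk.read(sector)
            except ReadError as err:
                # This copy is unreadable; try the next component.
                last_error = err
        # Every copy failed, so the error reaches the caller after all.
        raise last_error


# Two swapped-out "pages"; sector 1 has gone bad on one device.
pages = {0: b"page-a", 1: b"page-b"}
plain_swap = Disk(pages, bad_sectors={1})
mirrored_swap = Mirror(Disk(pages, bad_sectors={1}), Disk(pages))
```

Reading sector 1 from plain_swap raises ReadError, which is the "kernel
can't read it back" case. The same read from mirrored_swap returns the
page from the healthy component. Note that the model does nothing for a
device that returns wrong data without reporting an error.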
And please, pretty PLEASE stop talking about those mysterious
"undetectable" or "unreported" errors here. A drive that develops
"unreported" errors just does not work and should not be here in
the first place, just like bad memory or a bad CPU: if your CPU or memory
is failing, no software trick helps, and the failing part should
be replaced BEFORE even thinking about possible ways to recover.
/mjt