} -----Original Message-----
} From: linux-raid-owner@xxxxxxxxxxxxxxx [mailto:linux-raid-owner@xxxxxxxxxxxxxxx] On Behalf Of Bill Davidsen
} Sent: Saturday, May 09, 2009 7:08 PM
} To: Goswin von Brederlow
} Cc: linux-raid@xxxxxxxxxxxxxxx
} Subject: Re: Requesting replace mode for changing a disk
}
} Goswin von Brederlow wrote:
} > Hi,
} >
} > consider the following situation: You have a software raid that runs
} > fine but one disk is suspect (e.g. SMART says failure imminent or
} > something). How do you replace that disk?
} >
} > Currently you have to fail/remove the disk from the raid, add a
} > fresh disk and resync. That leaves a large window in which redundancy
} > is compromised. With current disk sizes that can be days.
} >
} > It would be nice if one could tell the kernel to replace a disk in a
} > raid set with a spare without the need to degrade the raid.
} >
} > Thoughts?
} >
}
} This is one of many things proposed occasionally here, no real
} objection, sometimes loud support, but no one actually *does* the code.
}
} You have described the problem exactly, and the solution is still to do
} it manually. But you don't need to fail the drive long term, if you can
} stop the array for a few moments. You stop the array, remove the suspect
} drive, create a raid1 of the suspect drive marked write-mostly and the
} new spare, then add the raid1 in place of the suspect drive. For any
} chunks present on the new drive the reads will go there, reducing
} access, while data is copied from the old to the new in resync, and
} writes still go to the old suspect drive so if the new drive fails you
} are no worse off. When the raid1 is clean you stop the main array and
} back the suspect drive out.
}
} This is complicated enough that I totally agree a hot migrate would be
} desirable. This is why people use lvm, although I make zero claims that
} this same problem will solve more easily, I'm just not an lvm guru (or
} even a newbie, just an occasional user).
If the disk is suspect, I would expect read errors! If there is even
one bad block on the suspect disk, this process will fail, because the
temporary raid1 has only the suspect disk to read that block from.

If the logic were built into md, any read error during the replacement
could be recovered from the other disk or disks in the array.
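To make the difference concrete, here is a toy sketch in Python. It is
only an illustration under my own assumptions, not md code: it models a
RAID5-style stripe with XOR parity, and every name in it (Disk,
plain_copy, replace_with_recovery) is made up for the example. A plain
copy has only the suspect disk as its source, while an array-aware
replace can rebuild an unreadable block from the other members.

```python
from functools import reduce


class ReadError(Exception):
    """Raised when a simulated disk hits an unreadable block."""


class Disk:
    """Hypothetical in-memory disk; 'bad' lists unreadable block numbers."""

    def __init__(self, blocks, bad=()):
        self.blocks = list(blocks)
        self.bad = set(bad)

    def read(self, i):
        if i in self.bad:
            raise ReadError(i)
        return self.blocks[i]


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def plain_copy(suspect):
    """Copy with the suspect disk as the only source (the manual raid1 case)."""
    return [suspect.read(i) for i in range(len(suspect.blocks))]


def replace_with_recovery(suspect, others):
    """Copy from the suspect disk, rebuilding unreadable blocks from the
    remaining members of the stripe (data XOR parity)."""
    out = []
    for i in range(len(suspect.blocks)):
        try:
            out.append(suspect.read(i))
        except ReadError:
            out.append(xor_blocks([d.read(i) for d in others]))
    return out


# Three-member stripe set: two data disks plus XOR parity.
d0 = [b"AAAA", b"BBBB", b"CCCC"]
d1 = [b"1111", b"2222", b"3333"]
parity = [xor_blocks([a, b]) for a, b in zip(d0, d1)]

suspect = Disk(d0, bad={1})          # block 1 of the suspect disk is unreadable
others = [Disk(d1), Disk(parity)]    # the healthy members of the array

new_disk = replace_with_recovery(suspect, others)
```

Here plain_copy(suspect) raises ReadError at block 1, whereas
replace_with_recovery() returns the full contents of d0, because the
missing block is the XOR of the matching blocks on the other two
members.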