Re: Suggestion for hot-replace

On 11/25/12 07:37, H. Peter Anvin wrote:
I was looking at the hot-replace (want_replacement) feature, and I had a thought: it would be nice to have this in a form which *didn't* fail the incumbent drive after the operation is over, and instead turned it into a spare. This would make it much easier and safer to periodically rotate and test any hot spares in the system. The main problem with hot spares is that you don't actually know if they work properly until there is a failover...

    -hpa


Sorry, I don't agree.

Firstly, it causes confusion. In 90% of cases, wanting a replacement means that the current drive is defective. If you put the replaced drive into the spare pool instead of kicking it out, then you have to remember (by serial number?) which one it was in order to actually remove it from the system. If you forget to note it down, you are in serious trouble: if that "spare" then gets pulled into another (or the same) array that needs a recovery, you have a high probability of exotic and unexpected multiple-failure situations.

Secondly, if you are uncertain of the health of your spares, risking your array by throwing one into it is definitely unwise. There are other techniques to test a spare that don't involve risking your array on it: you can remove one spare from the spare pool (best if you have 2+ spares, but it can also be done with 1), read and write all of it several times as a validation, then add it back to the spare pool. Even just reading it from beginning to end with dd could be enough, and for this you don't even have to remove it from the spare pool. And this doesn't degrade the array's performance, while your suggestion would.
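
For illustration, the read-only check can be sketched in a few lines of Python. This is only a rough equivalent of something like "dd if=/dev/sdX of=/dev/null bs=1M"; the function name and the chunk size are my own choices, and /dev/sdX is a placeholder, not a device from this thread. A bad sector should surface as an OSError from the kernel.

```python
def read_whole_device(path, chunk_size=1024 * 1024):
    """Read `path` sequentially from start to end and return the byte count.

    Nothing is written to the device. A read error reported by the
    kernel propagates to the caller as OSError.
    """
    total = 0
    # buffering=0 gives raw reads, so each read() maps to one read syscall.
    with open(path, "rb", buffering=0) as dev:
        while True:
            chunk = dev.read(chunk_size)
            if not chunk:  # an empty read means end of device
                break
            total += len(chunk)
    return total
```

You would call it as read_whole_device("/dev/sdX") as root and compare the result with the expected device size. Note that it only proves the sectors are readable; it says nothing about writes.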

Thirdly, if you really want that (imho unwise) behaviour, it's easy to implement from userspace without asking the MD developers to do so: monitor the replacement process and, as soon as you see it finish and the replaced drive is in Failed status, remove it and re-add it as a spare. That's it.
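
A rough sketch of that userspace logic, in Python. It looks for members flagged (F) in /proc/mdstat and builds the mdadm remove/add commands for them. The function names are made up for this example, the commands are only built and not executed, and I have not tested whether mdadm takes the drive back as a plain spare without extra steps, so treat that part as an assumption.

```python
import re

# A member in /proc/mdstat looks like "sdb1[1]" or "sdb1[1](F)".
MEMBER = re.compile(r"(\S+)\[\d+\]((?:\([A-Z]\))*)")

def faulty_members(mdstat_text, array):
    """Return the members of `array` (e.g. "md0") that are flagged (F)."""
    for line in mdstat_text.splitlines():
        if line.startswith(array + " :"):
            return [name for name, flags in MEMBER.findall(line)
                    if "(F)" in flags]
    return []

def respare_commands(array, members):
    """Build (but do not run) the commands to remove each member and add it back."""
    cmds = []
    for m in members:
        cmds.append(["mdadm", "/dev/" + array, "--remove", "/dev/" + m])
        cmds.append(["mdadm", "/dev/" + array, "--add", "/dev/" + m])
    return cmds
```

A small loop could read /proc/mdstat periodically, wait for the replacement to finish, and then pass each command list to subprocess.run. Keep in mind the first objection above: this puts a drive that may really be defective back into the spare pool.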