Hello John,

On Fri, Jun 10, 2011 at 7:25 AM, John Robinson
<john.robinson@xxxxxxxxxxxxxxxx> wrote:
> On 07/06/2011 09:52, John Robinson wrote:
>>
>> On 06/06/2011 19:06, Durval Menezes wrote:
>> [...]
>>>
>>> It would be great to have a
>>> "duplicate-this-bad-old-disk-into-this-shiny-new-disk" functionality,
>>> as it would enable an almost-no-downtime disk replacement with
>>> minimum risk, but it seems we can't have everything... :-0 Maybe it's
>>> something for the wishlist?
>>
>> It's already on the wishlist, described as a hot replace.
>
> Actually I've been thinking about this. I think I'd rather the hot replace
> functionality did a normal rebuild from the still-good drives, and only if
> it came across a read error from those would it attempt to refer to the
> contents of the known-to-be-failing drive (and then also attempt to repair
> the read error on the supposedly-still-good drive that gave a read error, as
> already happens).

This looks like a very good idea. The old (failing) drive would be kept
"on reserve", ready to be accessed for eventual failed sectors on the
other old (good) drives...

> My rationale for this is as follows: if we want to hot-replace a drive
> that's known to be failing, we should trust it less than the remaining
> still-good drives, and treat it with kid gloves. It may be suffering from
> bit-rot. We'd rather not hit all the bad sectors on the failing drive,
> because each time we do that we send the drive into 7 seconds (or more, for
> cheap drives without TLER) of re-reading, plus any Linux-level re-reading
> there might be. Further, making the known-to-be-failing drive work extra
> hard (doing the equivalent of dd'ing from it while also still using it to
> serve its contents as an array member) might make it die completely before
> we've finished.

I agree completely.

> What will this do for rebuild time? Well, I don't think it'll be any slower.

I think it will actually be faster.
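
The scheme John describes above can be sketched as a toy simulation. This
is only an illustration under stated assumptions, not md code: it assumes
single-parity (RAID-5-like) XOR redundancy, it assumes that rewriting a
block clears its read error, and the names Drive, ReadError, xor and
hot_replace are all hypothetical.

```python
from functools import reduce


class ReadError(Exception):
    """Raised when a simulated drive cannot read a block."""


class Drive:
    """Toy block device: a list of equal-sized blocks plus a set of bad ones."""

    def __init__(self, blocks, bad=()):
        self.blocks = list(blocks)
        self.bad = set(bad)
        self.reads = 0  # how many reads this drive has been asked to serve

    def read(self, i):
        self.reads += 1
        if i in self.bad:
            raise ReadError(f"block {i} unreadable")
        return self.blocks[i]

    def write(self, i, data):
        self.blocks[i] = data
        self.bad.discard(i)  # assumption: a rewrite repairs/remaps the sector


def xor(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def hot_replace(good, failing, new):
    """Rebuild `new` from the `good` drives, reading `failing` only on error."""
    for i in range(len(new.blocks)):
        got, broken = [], []
        for d in good:
            try:
                got.append(d.read(i))
            except ReadError:
                broken.append(d)
        if not broken:
            # Normal rebuild path: the failing drive is never touched.
            new.write(i, xor(got))
            continue
        if len(broken) > 1:
            raise ReadError(f"stripe {i}: more than one good drive failed")
        # Fallback: use the failing drive as a second source for this stripe.
        # If this read also fails, the stripe is lost and the error propagates.
        old = failing.read(i)
        new.write(i, old)
        # Repair the read error on the supposedly-still-good drive as well.
        broken[0].write(i, xor(got + [old]))
```

In this sketch the failing drive serves one read per stripe that hits an
error on a good drive, and none otherwise, which is the "kid gloves"
treatment argued for above.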

> On the one hand, you'd think that copying from one drive to another would be
> faster than a rebuild, because you're only reading 1 drive instead of N-1,
> but on the other, your array is going to run slowly (pretty much degraded
> speed) anyway because you're keeping one drive in constant use reading from
> it, and you risk it becoming much, much slower if you do run in to hundreds
> or thousands of read errors on the failing drive.
>
> So overall I think hot-replace should be a normal replace with a possible
> second source of data/parity.

Your reasoning sounds good to me.

> Thoughts?

Only sadness that it's not implemented yet... :-)

> Yes, I know, -ENOPATCH

Exactly :-)

Cheers,
--
   Durval Menezes.

>
> Cheers,
>
> John.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html