Hi Wol,

-----Original Message-----
From: Wols Lists [mailto:antlists@xxxxxxxxxxxxxxx]
Sent: Monday, November 14, 2016 7:58 AM
To: Bruce Merry
Cc: linux-raid@xxxxxxxxxxxxxxx
Subject: Re: What to do about Offline_Uncorrectable and Pending_Sector in RAID1

On 14/11/16 15:52, Bruce Merry wrote:
> On 13 November 2016 at 23:06, Wols Lists <antlists@xxxxxxxxxxxxxxx> wrote:
>> > Sounds like that drive could need replacing. I'd get a new drive
>> > and do that as soon as possible - use the --replace option of mdadm
>> > - don't fail the old drive and add the new.
> Would you mind explaining why I should use --replace instead of taking
> out the suspect drive? I guess I lose redundancy for any writes that
> occur while the rebuild is happening, but I'd plan to do this with the
> filesystem unmounted so there wouldn't be any writes.

>Because a replace will copy from the old drive to the new, recovering any
>failures from the rest of the array. A fail-and-add will have to rebuild
>the entire new array from what's left of the old, stressing the old array
>much more.

>Okay, in your case, it probably won't make an awful lot of difference, but
>it does make you vulnerable to problems on the "good" drive. To alter your
>wording slightly, you lose redundancy for writes AND READS that occur while
>the array is rebuilding. It's just good practice (and I point it out because
>--replace is new and not well known at present).

>Cheers,
>Wol

With respect to the --replace switch and "replacing a failed drive" as documented on the wiki here:

https://raid.wiki.kernel.org/index.php/Replacing_a_failed_drive

Can you clear a few things up for me?

1. If I just want to replace a working drive in a RAID1 and the array is still redundant, I can issue the following command, as in your example:

mdadm /dev/mdN [--fail /dev/sdx1] --remove /dev/sdx1 --add /dev/sdy1

which fails and removes sdx1 and replaces it with sdy1.

Question 1: How is this different from first doing a fail/remove on sdx1, physically replacing sdx1 with sdy1, and then doing an add on sdy1?

2. If one of the drives in a RAID1 has an error and gets kicked out of the array, so that the array loses redundancy, the wiki has the following example:

mdadm /dev/mdN --re-add /dev/sdX1
mdadm /dev/mdN --add /dev/sdY1 --replace /dev/sdX1 --with /dev/sdY1

Question 2: Is the point here to first try to re-add sdX1 with "--re-add" (first line above) and, if that fails, do a replace (second line above)?

Thanks,
Peter

To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html