Re: Maximizing failed disk replacement on a RAID5 array

Hello Brad, Drew,

Thanks for reminding me of the hammering a RAID level conversion would
cause. This is certainly a major reason to avoid the
RAID5->RAID6->RAID5 route.

The "repair" has been running here for a few days already, with the
server online, and ought to finish in 24 more hours. So far (thanks to
the automatic relocation on rewrite) the number of uncorrectable
sectors reported by SMART has dropped from 40 to 20, so it seems the
repair is doing its job. Let's just hope the disk has enough spare
sectors to remap all the bad ones; if it does, a simple "dd" from
the bad disk to its replacement ought to do the job (as you have
indicated).
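
For the record, a rough sketch of what that "dd" step could look like.
The helper name clone_disk is made up for illustration and the device
arguments are placeholders. conv=noerror,sync makes dd carry on past a
read error and pad the failed block with zeros, so everything after it
still lands at the same offset on the target.

```shell
#!/bin/sh
# clone_disk SRC DST: raw block-for-block copy of SRC onto DST.
# Hypothetical helper; SRC and DST are placeholders for the failing
# disk and its replacement.
clone_disk() {
    src=$1
    dst=$2
    # noerror: do not stop at the first read error.
    # sync:    pad every short or failed read up to bs with zeros, so
    #          later data keeps its offset (a partial final block gets
    #          padded as well).
    # A small bs means one bad sector costs only one small block.
    dd if="$src" of="$dst" bs=4096 conv=noerror,sync
}
```

With the array stopped (mdadm --stop /dev/md0, assuming md0 is the
array), it would be called as "clone_disk /dev/sdX /dev/sdY", with X
and Y standing in for the real device letters.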

On the other hand, as this "dd" has to be done with the array offline,
it will entail some downtime (although not as much as having to
restore the whole array from backups)... not ideal, but not too bad
either.

In case worst comes to worst, I have an up-to-date offline backup of
the contents of the whole array, so if something really bad happens, I
have something to restore from.

It would be great to have a
"duplicate-this-bad-old-disk-into-this-shiny-new-disk" functionality,
as it would enable an almost-no-downtime disk replacement with
minimum risk, but it seems we can't have everything... :-0 Maybe it's
something for the wishlist?

About mishaps with "dd", I think everyone who has ever dealt with a
system (not just Linux) at the level we do has at some point gone
through something similar... the last time I remember doing this was
many years ago, before Linux existed, when a few friends and I spent a
wonderful night installing William Jolitz's then-new 386BSD on a HD
(a process which *required* dd) and trashing its Windows partitions
(which contained the only copy of the graduation thesis of one of us,
due in a few days).
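
One cheap guard against this kind of mishap is to look at /proc/mdstat
before pointing dd at a disk. A minimal sketch, assuming the usual
mdstat layout where members show up as "sdb1[1]"; the function name
is_md_member is made up for illustration, and the optional second
argument exists only so it can be tried against a sample file.

```shell
#!/bin/sh
# is_md_member DEV [MDSTAT]: succeed if DEV, or one of its partitions,
# is listed as a member of an md array in MDSTAT.
# Hypothetical helper; MDSTAT defaults to /proc/mdstat.
is_md_member() {
    dev=$1
    mdstat=${2:-/proc/mdstat}
    name=${dev##*/}    # /dev/sdb -> sdb
    # Members look like " sdb[1]" or " sdb1[1]" (or " nvme0n1p1[0]"),
    # so allow an optional partition suffix before the "[".
    grep -Eq "(^| )${name}(p?[0-9]+)?\[" "$mdstat"
}
```

Something like "is_md_member /dev/sdb && echo 'sdb is in an array,
not touching it'" before the dd would at least catch the obvious case.
It does not check for mounted filesystems or anything else.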

Thanks for all the help,
--
   Durval Menezes.

On Mon, Jun 6, 2011 at 12:54 PM, Brad Campbell <brad@xxxxxxxxxxxxxxx> wrote:
>
> On 06/06/11 23:37, Drew wrote:
>>>
>>> Now, if I'm off the wall and missing something blindingly obvious feel free
>>> to thump me with a clue bat (it would not be the first time).
>>>
>>> I've lost 2 arrays recently. 8TB to a dodgy controller (thanks SIL), and 2TB
>>> to complete idiocy on my part, so I know the sting of lost or corrupted
>>> data.
>>
>> I think you've covered the process in more detail, including pitfalls,
>> than I have. :-) Only catch is where would you find a cheap 2-3TB
>> drive right now?
>
> I bought 10 recently for about $90 each. It's all relative, but I consider ~$45 / TB cheap.
>
>> I also know the sting of mixing stupidity and dd. ;-) A friend was
>> helping me do some complex rework with dd on one of my disks. Being
>> the n00b I followed his instructions exactly, and him being the expert
>> (and assuming I wasn't the n00b I was back then) didn't double check
>> my work. Net result was I backed the MBR/Partition Table up using dd,
>> but did so to a partition on the drive we were working on. There may
>> have been some alcohol involved (I was in University), the revised
>> data we inserted failed, and next thing you know I'm running Partition
>> Magic (the gnu tools circa 2005 failed to detect anything) to try and
>> recover the partition table. No backups obviously. ;-)
>
> Similar to my
>
> dd if=/dev/zero of=/dev/sdb bs=1M count=100
>
> except instead of the target disk, it was to a raid array member that was currently active. To its credit, ext3 and fsck managed to give me most of my data back, even if I had to spend months intermittently sorting/renaming inode numbers from lost+found into files and directories.
>
> I'd like to claim Alcohol as a mitigating factor (hell, it gets people off charges in our court system all the time) but unfortunately I was just stupid.
>