Re: How to avoid complete rebuild of RAID 6 array (6/8 active devices)

Hi,

David Greaves:
> I've found that once a disk starts to go bad there is a very strong
> tendency for it to continue to deteriorate.
> 
In my experience, that's true for older disks, but not necessarily for
those that are new and simply have a spot or two where the magnetizable
layer is a wee bit too thin.

However, even if they do in fact continue to deteriorate, the ability to
re-map the offending areas and continue gives me an order of magnitude
more time to deal with the problem.

In fact, as I said, there may be problems lurking on other disks which I
just haven't found yet (how often do you read all 5TB of your data?),
which means that a feature like this is the difference between being
able to recover and certain data loss, RAID-6 notwithstanding.


NB, one other problem I've observed (older kernel, I don't know if it's
been fixed) is that a resync is restarted from the beginning when a
fault on a second disk is encountered. BAD idea.
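
To illustrate what I'd expect instead: a minimal sketch of a resync loop
that remembers its position when a fault interrupts it, so the caller can
continue from there rather than from stripe 0. This is a toy model, not
the md code; resync_from() and rebuild_stripe() are hypothetical names
made up for the example.

```python
def resync_from(nstripes, rebuild_stripe, start=0):
    """Rebuild stripes start..nstripes-1 (toy model, not the md driver).

    rebuild_stripe is a hypothetical callback that raises OSError when
    a member disk faults. Returns the first stripe NOT yet rebuilt, so
    the caller can deal with the fault and resume from that position.
    """
    pos = start
    while pos < nstripes:
        try:
            rebuild_stripe(pos)
        except OSError:
            return pos      # keep the checkpoint instead of resetting to 0
        pos += 1
    return pos


if __name__ == "__main__":
    faulted = {"done": False}

    def flaky(i):
        # Simulate a second disk faulting once, at stripe 3.
        if i == 3 and not faulted["done"]:
            faulted["done"] = True
            raise OSError("fault on second disk")

    checkpoint = resync_from(6, flaky)
    print("interrupted at stripe", checkpoint)
    print("finished at", resync_from(6, flaky, start=checkpoint))
```

The work already done for the stripes before the checkpoint is kept;
only the stripe that hit the fault is retried.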


NB2, my ideal RAID recovery scenario looks like this:
* When a disk access fails, the offender is switched to write-only mode.
  I.e., the kernel ignores it when reading, but still tries to write
  correct data when something's updated.
* In order to re-sync a new disk, simply duplicate the old one if it
  hasn't been removed yet; of course, you need to do "real" recovery for
  the bad spots, and you need the aforementioned write-only code to
  update both (when writing to the area that's already synced up).
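
The two points above can be sketched roughly as follows. To keep it
short this assumes a plain mirror rather than RAID-6 parity, and it runs
the copy single-threaded, so every write simply goes to every member.
Disk, Mirror and resync() are hypothetical names for the example, not
kernel interfaces.

```python
class ReadError(Exception):
    pass


class Disk:
    def __init__(self, name, nblocks):
        self.name = name
        self.blocks = [b""] * nblocks
        self.bad = set()          # block numbers that fail on read
        self.write_only = False   # skipped for reads, still written

    def read(self, i):
        if i in self.bad:
            raise ReadError(f"{self.name}: block {i}")
        return self.blocks[i]

    def write(self, i, data):
        self.blocks[i] = data
        self.bad.discard(i)       # assumption: a rewrite remaps the spot


class Mirror:
    def __init__(self, disks):
        self.disks = list(disks)

    def read(self, i):
        for d in self.disks:
            if d.write_only:
                continue
            try:
                return d.read(i)
            except ReadError:
                d.write_only = True   # offender: ignore for reads from now on
        raise ReadError(f"block {i}: no readable member left")

    def write(self, i, data):
        for d in self.disks:          # write-only members are still updated
            d.write(i, data)


def resync(array, old, new, nblocks):
    """Fill `new` by duplicating `old`; recover only the bad spots."""
    new.write_only = True
    array.disks.append(new)           # array writes reach `new` from now on
    recovered = []
    for i in range(nblocks):
        try:
            data = old.read(i)        # cheap path: straight copy
        except ReadError:
            data = array.read(i)      # "real" recovery from healthy members
            recovered.append(i)
        new.write(i, data)
    new.write_only = False
    return recovered
```

Note that the healthy members are only read for the blocks that the old
disk can no longer deliver; everything else comes from the disk that is
being replaced.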

The _huge_ advantage of this process would be that a re-sync does not
affect the array's read performance at all (other than the higher CPU
usage). For some people, that can be quite important.

Now where can I get the largish chunk of time required to implement all
of this ... oh well.

-- 
Matthias Urlichs   |   {M:U} IT Design @ m-u-it.de   |  smurf@xxxxxxxxxxxxxx
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
 - -
The way to a man's heart is through the left ventricle.