Re: data corruption - the nightmare continues

Would it not be possible to recognize an initial failed read on
one disk of a mirror, take the read from the other disk,
write it back to the failed disk, and emit an explanation in the log?

A counter could be kept to decide when this strategy is
not working (i.e. more than X repairs in the last Y minutes).
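
To make the idea concrete, here is a rough userspace sketch in Python. It is
not MD code and not how the kernel does anything; the disk objects, ReadError,
RepairBudget and mirrored_read are all hypothetical names made up for
illustration. FakeDisk assumes that a write to a bad block makes it readable
again (i.e. the drive remaps it), which is exactly the behavior in question.

```python
import time
from collections import deque


class ReadError(Exception):
    """Hypothetical error raised when a disk cannot read a block."""


class RepairBudget:
    """Sliding-window counter: at most max_repairs (X) per window_seconds (Y)."""

    def __init__(self, max_repairs, window_seconds, clock=time.monotonic):
        self.max_repairs = max_repairs
        self.window_seconds = window_seconds
        self.clock = clock
        self.events = deque()

    def record(self):
        """Record one repair; return False once the budget is exceeded."""
        now = self.clock()
        self.events.append(now)
        # Forget repairs that have fallen out of the window.
        while now - self.events[0] > self.window_seconds:
            self.events.popleft()
        return len(self.events) <= self.max_repairs


class FakeDisk:
    """In-memory stand-in for one half of a mirror (assumption, for testing)."""

    def __init__(self, blocks, bad=()):
        self.blocks = dict(blocks)
        self.bad = set(bad)

    def read(self, block):
        if block in self.bad:
            raise ReadError(block)
        return self.blocks[block]

    def write(self, block, data):
        self.blocks[block] = data
        self.bad.discard(block)  # assumes the drive remaps the sector on write


def mirrored_read(block, primary, secondary, budget, log):
    """Read from primary; on failure take the mirror's copy and write it back."""
    try:
        return primary.read(block)
    except ReadError:
        data = secondary.read(block)  # if this fails too, the error propagates
        primary.write(block, data)
        log.append(f"block {block}: read failed, rewritten from mirror")
        if not budget.record():
            # Strategy is not working: fall back to kicking the disk.
            raise RuntimeError("too many repairs in window, fail the disk")
        return data
```

With something like RepairBudget(max_repairs=2, window_seconds=60), the third
repair inside a minute raises instead of papering over a disk that is dying.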

If disks reallocate bad blocks transparently, that should
be fine, no?
-Justin

On Wed, Mar 20, 2002 at 04:31:04PM +0100, Jakob Østergaard wrote:
> On Wed, Mar 20, 2002 at 09:38:44AM -0500, Justin wrote:
> > As I recall, at least in my U_ situations, when an array
> > goes U_, the 'failed' disk is no longer addressable at all,
> > until a reboot.. but next time it happens I'll try after
> > reboot reading the entire surface before re-writing it
> > to see if that picks up any errors.
> 
> Ok, cool.
> 
> > I could see how a read would fail, until a disk was told
> > to write, then the whole surface would work again.. if this
> > is common behavior for disks would that perhaps be
> > something the raid code could recognize and work around?
> 
> To me it has been fairly common.
> 
> But what workaround would you put into the MD code ?  Just write a zero block
> to the bad sector, and "gracefully" ignore the bad block (leaving the
> filesystem with a zeroed out hole) ?   No, the correct action is to kick the
> disk (IMO).
> 
> I've been thinking about doing things like nightly "scans" of the underlying
> disks - but that kind of code is much easier done in userspace (where it
> belongs).  Then, you'd have a failed disk in the morning, which is better
> than suddenly having a failed disk in a RAID-5 and then losing the entire
> array when number two disk fails during the re-sync.
> 
> -- 
> ................................................................
> :   jakob@unthought.net   : And I see the elder races,         :
> :.........................: putrid forms of man                :
> :   Jakob Østergaard      : See him rise and claim the earth,  :
> :        OZ9ABN           : his downfall is at hand.           :
> :.........................:............{Konkhra}...............:
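
For what it's worth, the nightly userspace scan mentioned in the quoted text
could look roughly like the Python sketch below. It only reads the device
front to back and notes which chunks return an I/O error; scan_device and the
chunk size are my own made-up names and values, and the device path you would
pass in is whatever your component disk happens to be. A real scan would
probably also want to bypass the page cache so the reads actually hit the
platters.

```python
import os


def scan_device(path, chunk_size=1024 * 1024):
    """Read path sequentially; return (offset, length, error) for failed reads."""
    bad = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            length = min(chunk_size, size - offset)
            try:
                data = os.pread(fd, length, offset)
            except OSError as exc:
                # Note the failing chunk and move on to the next one.
                bad.append((offset, length, exc.strerror))
                offset += length
                continue
            if not data:
                break  # unexpected end of device
            offset += len(data)
    finally:
        os.close(fd)
    return bad
```

Run from cron against each member disk, a non-empty result in the morning is
the cue to replace the disk before it drops out of the array by itself.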
