Re: Raid 5 - not clean and then a failure.

Jon Hardcastle <jd_hardcastle@xxxxxxxxx> writes:

> --- On Wed, 26/8/09, Goswin von Brederlow <goswin-v-b@xxxxxx> wrote:
>
>> From: Goswin von Brederlow <goswin-v-b@xxxxxx>
>> Subject: Re: Raid 5 - not clean and then a failure.
>> To: Jon@xxxxxxxxxxxxxxx
>> Cc: linux-raid@xxxxxxxxxxxxxxx
>> Date: Wednesday, 26 August, 2009, 12:18 PM
>> Jon Hardcastle <jd_hardcastle@xxxxxxxxx> writes:
>> 
>> > Guys,
>> >
>> > I have been having some problems with my arrays that I think I have
>> > nailed down to a PCI controller (well, I say that: it is always the
>> > drives connected to *a* controller, but I have tried 2!). Anyway, the
>> > latest saga is that I was trying some new kernel options last night,
>> > which didn't work.
>> >
>> > But when I booted up again this morning it said one of the drives
>> > was in an inconsistent state (not sure of the *exact* error
>> > message). I then kicked off an add of the drive and it started
>> > syncing. It got about 5% in, and then the second drive on that
>> > controller complained and the array failed.
>> >
>> > Is there any hope for my data? If I get a good controller in there,
>> > will the resync continue? Can I try and tell it to assume the drives
>> > are good (which they ought to be)?
>> >
>> > Please help!
>> 
>> The inconsistency is probably just a block here or there, and I'm
>> assuming none of your drives actually failed. So 99.9999% of your
>> data should be there. Just rebooting might actually get your raid
>> back (to syncing). If not, then you have to force reassembly from the
>> drives with the newest serials. That will give you some data
>> corruption in whatever was being written when the controller gave
>> errors. Worst case, you have to recreate the raid with --assume-clean.
>> 
>> I recommend adding a bitmap to the raid. That way a wrongfully failed
>> drive can be resynced in a matter of minutes instead of hours or
>> days. Makes it way less likely another error occurs during resync.
>> 
>> MfG
>>         Goswin
>> --
>> To unsubscribe from this list: send the line "unsubscribe
>> linux-raid" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
>
> I did look into bitmaps *a bit*. I could easily have the image for my
> 6-drive RAID 5 stored on the RAID 1 I have in the same system. The
> googling I did, though, did not paint a pretty picture: it talked
> about huge performance hits?

That depends a lot on the bitmap size.

It also depends on the frequency of errors. If your controller has a
hiccup once a week that causes a drive to fail, and you need 1 day to
rebuild the array, you will be left with a double disk failure pretty
quickly without bitmaps.
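
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. It rests on assumptions that are mine and not measured on any
real system: hiccups arrive at random (a Poisson process) at one per
week, any hiccup that lands inside the resync window costs you a second
drive, and a bitmap-based resync takes about 5 minutes. The function
name p_hiccup_during is made up for this illustration.

```python
import math

MINUTES_PER_WEEK = 7 * 24 * 60


def p_hiccup_during(window_minutes, hiccups_per_week=1.0):
    """Chance of at least one hiccup inside the window.

    Assumes hiccups are independent random events (Poisson arrivals),
    which is a simplification of how a flaky controller behaves.
    """
    rate_per_minute = hiccups_per_week / MINUTES_PER_WEEK
    return 1.0 - math.exp(-rate_per_minute * window_minutes)


# Full rebuild: one day, as in the example above.
full_rebuild = p_hiccup_during(24 * 60)

# Bitmap resync: 5 minutes is an assumed figure, not a measurement.
bitmap_resync = p_hiccup_during(5)

print(f"risk per full rebuild:  {full_rebuild:.4f}")
print(f"risk per bitmap resync: {bitmap_resync:.6f}")
print(f"ratio: {full_rebuild / bitmap_resync:.0f}x")
```

Under those assumptions a one-day rebuild has roughly a 13% chance of
being hit by the next hiccup, so after a handful of weekly failures a
double failure becomes likely, while a resync of a few minutes is hit
only about 0.05% of the time. The exact figures matter less than the
gap between them.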

MfG
        Goswin
