Re: device with newer data added as spare - data now gone?

I've got into the habit of comparing the output of "cat /proc/mdstat"
with that of "mdadm -Esbv" to see whether there's any old md metadata
floating around on disks I'm about to use. Just as
a precaution. If I find any, I --zero-superblock the disk
before re-using it, so that I don't get caught out by
events like this.
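
As a rough sketch, that comparison could be scripted along these lines in
Python. It assumes the usual text layouts of /proc/mdstat and of
"mdadm -Esbv" (a "devices=" line under each ARRAY line). The sample text,
the device name sda4 and the all-zero UUID are made up for illustration,
and the function names are hypothetical.

```python
import re

def active_members(mdstat_text):
    """Device names (e.g. 'sda4') that are members of running arrays."""
    members = set()
    for line in mdstat_text.splitlines():
        # Array lines look like: "md3 : active raid1 sda4[0] sdb4[1](S)"
        if re.match(r"^md\S*\s*:", line):
            members.update(re.findall(r"([^\s\[]+)\[\d+\]", line))
    return members

def devices_with_metadata(examine_text):
    """Device names listed on the 'devices=' lines of 'mdadm -Esbv'."""
    devs = set()
    for line in examine_text.splitlines():
        line = line.strip()
        if line.startswith("devices="):
            for path in line[len("devices="):].split(","):
                path = path.strip()
                if path.startswith("/dev/"):
                    path = path[len("/dev/"):]
                if path:
                    devs.add(path)
    return devs

def leftover_metadata(mdstat_text, examine_text):
    """Devices carrying md metadata that no running array is using."""
    return devices_with_metadata(examine_text) - active_members(mdstat_text)

# Illustrative input only (not captured from a real system).
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md3 : active raid1 sda4[0]
      104320 blocks [2/1] [U_]

unused devices: <none>
"""

SAMPLE_EXAMINE = """\
ARRAY /dev/md3 level=raid1 num-devices=2 UUID=00000000:00000000:00000000:00000000
   devices=/dev/sda4,/dev/sdb4
"""

stale = leftover_metadata(SAMPLE_MDSTAT, SAMPLE_EXAMINE)
print(sorted(stale))  # prints ['sdb4']
```

On a real machine the two strings would come from reading /proc/mdstat and
from running "mdadm -Esbv" as root. Anything the script reports is only a
candidate to look at by hand before deciding what to do with it.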

Rgds,

John


On Wed, Jul 1, 2009 at 3:43 AM, Roger Heflin<rogerheflin@xxxxxxxxx> wrote:
> Molinero wrote:
>>
>> Hi all
>>
>> I've lost quite a lot of data on my /home raid partition and I'm wondering
>> what exactly I did to make it happen. I'd like to know so something
>> similar won't happen in the future.
>>
>> I'm pretty much a raid newbie. I set up raid1 on my home server and I'm
>> guessing that something like this happened. Please tell me if it's
>> possible.
>>
>> * Some time ago I did something that made one device fail, which left
>> md3 with only 1 device.
>> * Time went by without me noticing (because I suck)
>> * An update broke my raid setup and gave me a kernel panic (because I
>> suck). Didn't put the mdadm and raid hooks in mkinitcpio.conf
>> * Booted a live-cd, mounted the drives and chrooted back into the system
>> and fixed the mkinitcpio.conf
>> * Rebooted and noticed that md3 was running with only 1 device
>> * Added sdb4 to md3 and it then read 1 device with 1 spare
>> * cat /proc/mdstat started to say "recovery"
>> * All data from approx. 1 year is gone
>>
>> I'm guessing that the old (not updated) device was set as "master" and the
>> data on the drive (containing newer data) was overwritten by data on the
>> old device - is this plausible?
>
> If the old device was brought up as md3 and had dropped out months ago, the
> data would now be the data that existed when that disk dropped off. And
> when a device drops out, there is no mark on that device marking it as bad,
> since the typical reason for a device dropping off is that it is no
> longer talking. And sometimes mirrors are intentionally broken to
> preserve a copy, for example to be able to quickly back out of a serious
> OS upgrade that did not go well.
>
> If you added the current device as a spare it would have copied the data
> from the old device over the current device.
>
> That is one thing that would make a 3+ disk raid5 a bit more resistant to
> this: with a dropped off disk you could not start the array with only the
> dropped device, and with all 3, 2 of the devices will know the 3rd was
> dropped at some time in the past, and with any 2, one of those devices would
> believe the other one was marked bad.
>
>
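
One way to tell which half of a broken mirror is the stale one before
re-adding anything is to compare the "Events" counter that
"mdadm --examine" prints for each member; the member with the lower count
stopped being updated earlier. A minimal Python sketch, assuming the text
of each device's --examine output has already been collected. The counter
values and the sda4 name below are invented for illustration, and the
function names are hypothetical.

```python
import re

def event_count(examine_text):
    """Return the Events counter as a tuple of ints.

    Handles both a plain number ("Events : 1042") and the dotted form
    that older mdadm versions print for 0.90 metadata ("Events : 0.60").
    """
    m = re.search(r"^\s*Events\s*:\s*([\d.]+)", examine_text, re.MULTILINE)
    if m is None:
        raise ValueError("no Events line found")
    return tuple(int(part) for part in m.group(1).split("."))

def freshest(examine_by_device):
    """Name of the device whose superblock has the highest event count."""
    return max(examine_by_device,
               key=lambda dev: event_count(examine_by_device[dev]))

# Illustrative values only: sdb4 kept receiving writes, sda4 dropped out early.
SAMPLE = {
    "/dev/sda4": "          State : clean\n         Events : 57\n",
    "/dev/sdb4": "          State : clean\n         Events : 1042\n",
}

print(freshest(SAMPLE))  # prints /dev/sdb4
```

The device this picks is the one to keep as the source. The other one is
the one that should be added back and overwritten by the resync.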
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>