Re: Two degraded mirror segments recombined out of sync for massive data loss

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Apr 7, 2010 at 1:45 PM, Phillip Susi <psusi@xxxxxxxxxx> wrote:
> The gist of the problem is this: after booting a mirror in degraded mode
> with only the first disk, then doing the same with only the second disk,
> then booting with both disks again, mdadm happily recombines the two
> disks out of sync, causing two divergent filesystems to become munged
> together.
>
> The problem was initially discovered testing the coming lucid release of
> Ubuntu doing clean installs in a virtualization environment, and I have
> reproduced it manually activating and deactivating the array built out
> of two lvm logical volumes under Karmic.  What seems to be happening is
> that when you activate in degraded mode ( mdadm --assemble --run ), the
> metadata on the first disk is changed to indicate that the second disk
> was faulty and removed.  When you activate with only the second disk,
> you would think it would say the first disk was faulty, removed, but for
> some reason it ends up only marking it as removed, but not faulty.  Now
> both disks are degraded.
>
> When mdadm --incrmental is run by udev on the first disk, it happily
> activates it since the array is degraded, but has one out of one active
> member present, with the second member faulty,removed.  When mdadm
> --incremental is run by udev on the second disk, it happily slips the
> disk into the active array, WITHOUT SYNCING.
>
> My two questions are:
>
> 1) When doing mdadm --assemble --run with only the second disk present,
> shouldn't it mark the first disk as faulty, removed instead of only removed?
>
> 2) When mdadm --incremental is run on the second disk, shouldn't it
> refuse to use it since the array says the second disk is faulty, removed?
>
> The bug report related to this can be found at:
>
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/557429
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

It sounds like the last 'synced' time should be tracked, as well as
the last modification time. If the two differ then it can be known
that the contents has diverged since last sync.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux