Re: 1.X metadata: Resuming an interrupted incremental recovery for RAID1.

"Andrei E. Warkentin" <andrey.warkentin@xxxxxxxxx> · Wed, 12 Oct 2011 12:15:01 -0400

2011/10/12 Andrei E. Warkentin <andrey.warkentin@xxxxxxxxx>:
>> <thinks....>
>>
>> I think that if a spare being bitmap-recovered fails and then gets re-added,
>> then we have to start the bitmap-recovery from the beginning.  i.e. we cannot
>> use the recovery_offset.  This is because it is entirely possible that while
>> the device was missing the second time, a write went to an address before
>> recovery_offset and so there is a new bit, which the recovery has to handle.
>>
>> So the correct thing to do is to *not* update the metadata on the recovering
>> device until recovery completes.  Then if it fails and is re-added, it will
>> look just the same as when it was re-added the first time, and will do a
>> bitmap-based recovery.
>> Only when the bitmap-based recovery finishes should the metadata be updated
>> with a new event count, and then the bitmap can start forgetting those old
>> bits.
>>
>> Credible?
>> </thinks>
>
> I think you are spot on. If MD_FEATURE_IN_RECOVERY is set in the SB,
> the recovery_offset should be disregarded and recovery should start
> at sector zero. In fact, I've verified that in the *current case*,
> the full recovery that follows an interrupted incremental one
> actually starts at recovery_offset, so it's broken right now.
>

Actually, doesn't this imply that recovery_offset should never be used
during a recovery (full or incremental)? After all, there could have
been I/O to the degraded array while the device was missing.
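
To make the failure concrete, here is a toy model in Python. It is not
md code: the chunk numbers and the names recover() and simulate() are
made up for illustration, and it assumes (as Neil describes above) that
the bitmap keeps its bits until the recovery has fully completed.

```python
def recover(dirty, stale, start, stop):
    """Resync every bitmap-dirty chunk c with start <= c < stop."""
    for c in sorted(dirty):
        if start <= c < stop:
            stale.discard(c)


def simulate(resume_from_offset):
    """Return the chunks still stale on the re-added device at the end."""
    nchunks = 8

    # Device drops out the first time; chunks 1 and 5 are written meanwhile.
    dirty = {1, 5}        # bits set in the write-intent bitmap
    stale = set(dirty)    # chunks that are out of date on the device

    # Device is re-added; incremental recovery is interrupted at chunk 4.
    recovery_offset = 4
    recover(dirty, stale, 0, recovery_offset)

    # Device is missing a second time; a write lands *before* the offset.
    dirty.add(2)
    stale.add(2)

    # Device is re-added again. Either trust recovery_offset or restart at 0.
    start = recovery_offset if resume_from_offset else 0
    recover(dirty, stale, start, nchunks)
    return stale


if __name__ == "__main__":
    print("resume at recovery_offset:", sorted(simulate(True)))
    print("restart from zero:        ", sorted(simulate(False)))
```

In this model, resuming at recovery_offset leaves chunk 2 stale, while
restarting from zero leaves nothing stale.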

A