Re: [RFC] MD: Allow restarting an interrupted incremental recovery.

NeilBrown <neilb@xxxxxxx> · Mon, 17 Oct 2011 14:20:39 +1100

On Thu, 13 Oct 2011 21:18:43 -0400 "Andrei E. Warkentin"
<andrey.warkentin@xxxxxxxxx> wrote:

> 2011/10/12 Andrei Warkentin <andreiw@xxxxxxxxxx>:
> > If an incremental recovery was interrupted, a subsequent
> > re-add will result in a full recovery, even though an
> > incremental should be possible (seen with raid1).
> >
> > Solve this problem by not updating the superblock on the
> > recovering device until array is not degraded any longer.
> >
> > Cc: Neil Brown <neilb@xxxxxxx>
> > Signed-off-by: Andrei Warkentin <andreiw@xxxxxxxxxx>
> 
> FWIW it appears to me that this idea seems to work well, for the
> following reasons:
> 
> 1) The recovering sb is not touched until the array is not degraded
> (only for incremental sync).
> 2) The events_cleared count isn't updated in the active bitmap sb
> until array is not degraded. This implies that if the incremental was
> interrupted, recovering_sb->events is NOT less than
> active_bitmap->events_cleared).
> 3) The bitmaps (and sb) are updated on all drives at all times as it
> were before.
> 
> How I tested it:
> 1) Create RAID1 array with bitmap.
> 2) Degrade array by removing a drive.
> 3) Write a bunch of data (Gigs...)
> 4) Re-add removed drive - an incremental recovery is started.
> 5) Interrupt the incremental.
> 6) Write some more data.
> 7) MD5sum the data.
> 8) Re-add removed drive - and incremental recovery is restarted (I
> verified it starts at sec 0, just like you mentioned it should be, to
> avoid consistency issues). Verified that, indeed, only changed blocks
> (as noted by write-intent) are synced.
> 10)  Remove other half.
> 11) MD5sum data - hashes match.
> 
> Without this fix, you would of course have to deal with a full resync
> after the interrupted incremental.
> 
> Is there anything you think I'm missing here?
> 
> A

Not much, it looks good, and your testing is of course a good sign.

My only thought is whether we really need the new InIncremental flag.
You set it exactly when saved_raid_disk is set, and clear it exactly when
saved_raid_disk is cleared (set to -1).
So maybe we can just used saved_raid_disk.
If you look at it that way, you might notice that saved_raid_disk is also set
in slot_store, so probably InIncremental should be set there.  So that might
be the one thing you missed.

Could you respin the patch without adding InIncremental, and testing 
  rdev->saved_raid_disk >= 0
instead, check if you agree that should work, and perform a similar test?
(Is that asking too much?).

If you agree that works I would like to go with that version.

Thanks,
NeilBrown

Attachment:
signature.asc

Description: PGP signature