Re: [RFC] MD: Allow restarting an interrupted incremental recovery.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 17 Oct 2011 10:26:44 -0700 (PDT) Andrei Warkentin
<awarkentin@xxxxxxxxxx> wrote:

> Hi Neil,

> > If you look at it that way, you might notice that saved_raid_disk is
> > also set
> > in slot_store, so probably InIncremental should be set there.  So
> > that might
> > be the one thing you missed.
> >
> 
> Actually, maybe you can clarify something a bit about that code? The part where
> slot != -1 and mddev->pers != NULL looks a lot like the add_new_disk path - except:
> 
> After pers->hot_add_disk:
> 1) rdev In_sync isn't cleared

That looks like a bug.  I don't think I've ever tested re-adding a device by
setting 'slot' - only adding.  And for adding, In_sync is already clear.
Thanks for spotting that.

> 2) Array isn't checked for degraded state and MD_RECOVERY_RECOVER isn't conditionally set.

I think that isn't really needed in add_new_disk.
md_check_recovery will set it before recovery starts if there are spares in
the array.
So I'm inclined to just remove it from add_new_disk, but I feel that I need
to read the code a bit more carefully first.

In any case, the next point makes the omission harmless even if
MD_RECOVERY_RECOVER is needed there for some reason.


> 3) MD_RECOVERY_NEEDED isn't set.


That might be deliberate.  It allows a few drives to be added, then recovery
started by writing "recover" to sync_action.
The same could be achieved by writing "frozen" to sync_action first but I
probably wrote this code before I added "frozen".


> 4) mddev->thread is't woken up.

That goes with 3.  we only wake up the thread so that it notices
MD_RECOVERY_NEEDED.


> 
> Is this because if an array was degraded AND there were extra spares, they would already be
> assigned to the array?

Probably, yes.

Thinking a bit more, you would only get to set 'slot' on a spare in an active
array if you had first 'frozen' sync_action .. so some of my comments above
might be a bit wrong.
And given that you have frozen sync_action, you really don't want to set
MD_RECOVERY_NEEDED or start the thread.


Thanks,
NeilBrown



>  
> > Could you respin the patch without adding InIncremental, and testing
> >   rdev->saved_raid_disk >= 0
> > instead, check if you agree that should work, and perform a similar
> > test?
> > (Is that asking too much?).
> > 
> 
> Absolutely! Thanks again!
> 
> A

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux