Re: [mdadm GIT PULL] rebuild checkpoints, incremental assembly, volume delete/rename, and fixes

Dan Williams <dan.j.williams@xxxxxxxxx> · Thu, 10 Jun 2010 23:42:16 -0700

On Sun, May 30, 2010 at 6:37 PM, Neil Brown <neilb@xxxxxxx> wrote:
> On Wed, 26 May 2010 17:50:52 -0700
> Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>
>> Hi Neil,
>>
>> A collection of updates that have been separated onto individual topic
>> branches.  I provide a url, summary, and full diff for each topic if you
>> want to do piecemeal pulls, otherwise the merged set is available here:
>>
>>       git://github.com/djbw/mdadm.git master
>>
>> Dan Williams (10):
>>       mdmon: fix missing open of md/<dev>/recovery_start
>>       mdmon: periodically checkpoint recovery
>
> I don't understand why you write 'idle' to 'sync_action' here.
> It should not be needed.
> Shouldn't we just be listening for events on sync_completed, and write
> them to the metadata??

This was an attempt to re-use the same mechanism for intermediate
checkpointing as array-shutdown checkpointing.

We currently checkpoint from md/resync_start and md/dev/recovery_start
at sync_action == idle, but I will append a fix up patch to use
->last_checkpoint whenever is_resync_complete() is false.

By the way... listening for sync_completed events will fix a bug with
handling the write-pending event.  I have hit occasions, and others
have too [1], where the system locks up while resyncing.  I believe it
is because the wakeup of the blocked thread in md_write_start() races
with the sync thread doing a notification for a checkpoint.  I'll try
to verify that the lockup goes away when polling sync_completed, but
I'm fairly certain that is the bug.

>
>>       imsm: dump each disk's view of the slot state
>>       Kill subarray
>
> I assume that this is only allowed on an idle container because the
> volume index number of subsequent volumes will change, and as that is used in
> the uuid the uuid will change, and that is bad.

Yes.

> 1/ Can you leave a 'place-holder' empty volume in the list, that a subsequent
>   create can use?

No, this would confuse the option-rom.

> 2/ As ddf doesn't suffer from this issue, the test for 'active volume that
>   will get a new uuid' should be in the super-intel code.  And it should
>   be exactly that test - if earlier volumes are active, that should not be a
>   problem.

Ok, this also needs communication with mdmon for the active case, I'll add that.

>   Maybe a new method "safe_to_delete_volume".... or something.

I think we can just fail the delete in the handler.  It also appears
like we will be getting a unique identifier in a future update to the
metadata.  It will have a flag so that we can safely decide "use old
synthesized value" versus "use new stable/recorded value".

> Also:  -ENODOC.

Yeah, this was a rewrite of an earlier attempt so wanted to make sure
it was going in the right direction.  Will add.

>
>>       Rename subarray
>
> Rename will change the uuid of the renamed array, but not the uuid of
> anything else, so it should be allowed if just the volume is inactive.

Like above need to close the loop with mdmon.

>
> Also it would be nicer if you factored out open_subarray in the previous
> patch, but that isn't a big issue (I would still accept if that was the only
> issue).

Ok, I'll keep that in mind for next time.

>
>
>>       Incremental: honor an 'enough' flag from external handlers
>>       Revert "Incremental: honor --no-degraded to delay assembly"
>>       Merge branch 'subarray' into for-neil
>>       imsm: robustify recovery-start detection
>
> "robistify" !! Is that a neologism ??

;-) right up there with "borkage".

> I've merged and pushed out the other bits which all seem OK.

Ok, there was one more you didn't comment on and didn't cherry-pick [2]

Dave Jiang (1):
      create: Check with OROM limit before setting default chunk size

Thanks,
Dan

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=602457
[2]: http://git.kernel.org/?p=linux/kernel/git/djbw/mdadm.git;a=commitdiff;h=f8cde132
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html