On Sun, May 30, 2010 at 6:37 PM, Neil Brown <neilb@xxxxxxx> wrote: > On Wed, 26 May 2010 17:50:52 -0700 > Dan Williams <dan.j.williams@xxxxxxxxx> wrote: > >> Hi Neil, >> >> A collection of updates that have been separated onto individual topic >> branches. I provide a url, summary, and full diff for each topic if you >> want to do piecemeal pulls, otherwise the merged set is available here: >> >> git://github.com/djbw/mdadm.git master >> >> Dan Williams (10): >> mdmon: fix missing open of md/<dev>/recovery_start >> mdmon: periodically checkpoint recovery > > I don't understand why you write 'idle' to 'sync_action' here. > It should not be needed. > Shouldn't we just be listening for events on sync_completed, and write > them to the metadata?? This was an attempt to re-use the same mechanism for intermediate checkpointing as array-shutdown checkpointing. We currently checkpoint from md/resync_start and md/dev/recovery_start at sync_action == idle, but I will append a fix up patch to use ->last_checkpoint whenever is_resync_complete() is false. By the way... listening for sync_completed events will fix a bug with handling the write-pending event. I have hit occasions, and others have too [1], where the system locks up while resyncing. I believe it is because the wakeup of the blocked thread in md_write_start() races with the sync thread doing a notification for a checkpoint. I'll try to verify that the lockup goes away when polling sync_completed, but I'm fairly certain that is the bug. > >> imsm: dump each disk's view of the slot state >> Kill subarray > > I assume that this is only allowed on an idle container because the > volume index number of subsequent volumes will change, and as that is used in > the uuid the uuid will change, and that is bad. Yes. > 1/ Can you leave a 'place-holder' empty volume in the list, that a subsequent > create can use? No, this would confuse the option-rom. > 2/ As ddf doesn't suffer from this issue, the test for 'active volume that > will get a new uuid' should be in the super-intel code. And it should > be exactly that test - if earlier volumes are active, that should not be a > problem. Ok, this also needs communication with mdmon for the active case, I'll add that. > Maybe a new method "safe_to_delete_volume".... or something. I think we can just fail the delete in the handler. It also appears like we will be getting a unique identifier in a future update to the metadata. It will have a flag so that we can safely decide "use old synthesized value" versus "use new stable/recorded value". > Also: -ENODOC. Yeah, this was a rewrite of an earlier attempt so wanted to make sure it was going in the right direction. Will add. > >> Rename subarray > > Rename will change the uuid of the renamed array, but not the uuid of > anything else, so it should be allowed if just the volume is inactive. Like above need to close the loop with mdmon. > > Also it would be nicer if you factored out open_subarray in the previous > patch, but that isn't a big issue (I would still accept if that was the only > issue). Ok, I'll keep that in mind for next time. > > >> Incremental: honor an 'enough' flag from external handlers >> Revert "Incremental: honor --no-degraded to delay assembly" >> Merge branch 'subarray' into for-neil >> imsm: robustify recovery-start detection > > "robistify" !! Is that a neologism ?? ;-) right up there with "borkage". > I've merged and pushed out the other bits which all seem OK. Ok, there was one more you didn't comment on and didn't cherry-pick [2] Dave Jiang (1): create: Check with OROM limit before setting default chunk size Thanks, Dan [1]: https://bugzilla.redhat.com/show_bug.cgi?id=602457 [2]: http://git.kernel.org/?p=linux/kernel/git/djbw/mdadm.git;a=commitdiff;h=f8cde132 -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html