Re: [RFC] t7410: 210 tests for various 'git submodule update' scenarios

"W. Trevor King" <wking@xxxxxxxxxx> · Wed, 16 Apr 2014 10:21:53 -0700

On Wed, Apr 16, 2014 at 02:54:48AM +0200, Johan Herland wrote:
> This is a work-in-progress to flesh out (and promote discussion about)
> the expected behaviors for all possible scenarios in which
> 'git submodule update' might be run.

This is lovely :).

> +#  - current state of submodule:
> +#     ?.?.?.1 - not yet cloned
> +#     ?.?.?.2 - cloned, detached, HEAD == gitlink
> +#     ?.?.?.3 - cloned, detached, HEAD != gitlink
> +#     ?.?.?.4 - cloned, on branch foo (exists upstream), HEAD == gitlink
> +#     ?.?.?.5 - cloned, on branch foo (exists upstream), HEAD != gitlink
> +#     ?.?.?.6 - cloned, on branch bar (MISSING upstream), HEAD == gitlink
> +#     ?.?.?.7 - cloned, on branch bar (MISSING upstream), HEAD != gitlink

The remote branches should only matter for the initial clone and
--remote updates.  Also, only the configured submodule.<name>.branch
(your first axis) should be checked; the locally checked-out submodule
branch shouldn't matter.
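
For anyone following the numbering, here is a rough sketch of how the
four-part IDs enumerate.  The axis sizes (3 x 5 x 2 x 7) are my
inference from the IDs quoted in this thread and the 210 in the
subject line, not something copied from the patch, and the function
names are made up for illustration.  The "Affects:" patterns are
mostly plain shell globs, so a case statement can expand them.

```sh
# Print every scenario ID in the assumed 3 x 5 x 2 x 7 grid:
#   axis 1: submodule.<name>.branch setting (1-3)
#   axis 2: update mode (1-5; 2-5 are checkout, merge, rebase, !command)
#   axis 3: update target (1 = gitlink, 2 = --remote)
#   axis 4: current state of the submodule (1-7, as listed above)
scenario_ids () {
    for branch in 1 2 3
    do
        for mode in 1 2 3 4 5
        do
            for target in 1 2
            do
                for state in 1 2 3 4 5 6 7
                do
                    echo "$branch.$mode.$target.$state"
                done
            done
        done
    done
}

# Print the scenario IDs matching a shell glob such as '1.3.*'.
matching_ids () {
    scenario_ids | while read -r id
    do
        case "$id" in
        $1) echo "$id" ;;
        esac
    done
}
```

Running "scenario_ids | wc -l" counts 210 IDs, and
"matching_ids '?.?.?.1'" lists the 30 not-yet-cloned cases.  A pattern
like 1.3.2.>=2 is not a glob and would need to be spelled
'1.3.2.[2-7]'.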

> +# T2: Test with submodule.<name>.url != submodule's remote.origin.url. Does
> +#     "submodule update --remote" sync with submodule.<name>.url, or with the
> +#     submodule's origin? (or with the submodule's current branch's upstream)?

All fetches should currently use the submodule's remote.origin.url.
submodule.<name>.url is only used for the initial clone (*.*.*.1), and
never referenced again.  This would change using my tightly-bound
submodule proposal [1], where a difference between
submodule.<name>.url and the submodule's @{upstream} URL would
trigger a dirty-tree condition (for folks with tight-bind syncing
enabled) from which you couldn't update before resolving the
difference.
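
To see whether the two URLs have drifted apart for a given submodule,
something like the following should do.  This is only a sketch: the
function name is hypothetical, it assumes you run it from the top of
the superproject, and it assumes the submodule at the given path is
already cloned (otherwise "git -C" would read the superproject's own
config).

```sh
# Compare the superproject's submodule.<name>.url with the submodule's
# own remote.origin.url.  $1 is the submodule name, $2 its path.
# Returns 0 if the URLs differ, 1 if they match, 2 if either is unset.
submodule_url_mismatch () {
    name=$1
    path=$2
    configured=$(git config --get "submodule.$name.url") || return 2
    origin=$(git -C "$path" config --get remote.origin.url) || return 2
    echo "submodule.$name.url: $configured"
    echo "remote.origin.url:   $origin"
    test "$configured" != "$origin"
}
```

For example, "submodule_url_mismatch sub path/to/sub && echo differ"
prints both URLs and then "differ" when they do not match.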

> +# D1: When submodule is already at right commit, checkout-mode currently does
> +#     nothing. Should it instead detach, even when no update is needed?
> +#     Affects: 1.2.1.4, 1.2.1.6, 2.2.1.4, 2.2.1.6, 3.2.1.4, 3.2.1.6

“Checkout updates always leave a detached HEAD” seems easier to
explain, so I'm leaning that way.

> +# D2: Should all/some of 1.3.*/1.4.* abort/error because we don't know what to
> +#     merge/rebase with (because .branch is unset)? Or is the current default
> +#     to origin/HEAD OK?
> +#     Affects: 1.3.*, 1.4.*

Maybe you mean 1.3.*, 1.4.*, and 1.5.* (merge, rebase, and !command)?
In all of these cases, we're integrating the current HEAD with the
gitlinked (*.*.1.*) or remote-tracking branch (*.*.2.*).  Since
submodule.<name>.branch defaults to master (and may be changed to HEAD
after a long transition period? [2,3]), I don't think we should
abort/error in those cases.
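
The fallback I have in mind looks roughly like this.  It is a sketch
of the lookup order as I understand it from the docs (the
superproject's config takes precedence over .gitmodules), not a copy
of what git-submodule.sh does, and the function name is hypothetical.
It assumes you run it from the top of the superproject's work tree.

```sh
# Print submodule.<name>.<option>, checking the superproject's config
# first, then .gitmodules, and falling back to the supplied default
# when the key is unset in both.
submodule_option () {
    name=$1
    option=$2
    default=$3
    value=$(git config --get "submodule.$name.$option") ||
    value=$(git config -f .gitmodules --get "submodule.$name.$option" 2>/dev/null) ||
    value=$default
    printf '%s\n' "$value"
}
```

With that, "submodule_option sub branch master" prints master
whenever .branch is unset, so merge, rebase, and !command updates
always have something to integrate with.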

> +# D3: When submodule is already at right commit, merge/rebase-mode currently
> +#     does nothing. Should it do something else (e.g. not leave submodule
> +#     detached, or checked out on the "wrong" branch (i.e. != .branch))?
> +#     (This discussion point is related to D1, D5 and D6)

“Non-checkout updates always leave you on a branch” seems easier to
explain, but I think we'd want to distinguish between the local branch
and the remote submodule.<name>.branch [1].  Lacking that distinction,
I'd prefer to leave the checked-out branch unchanged.

> +# D4: When 'submodule update' performs a clone to populate a submodule, it
> +#     currently also creates a default branch (named after origin/HEAD) in
> +#     that submodule, EVEN WHEN THAT BRANCH WILL NEVER BE USED (e.g. because
> +#     we're in checkout-mode and submodule will always be detached). Is this
> +#     good, or should the clone performed by 'submodule update' skip the
> +#     automatic local branch creation?
> +#     Affects: 1.2.*.1, 1.3.*.1, 1.4.*.1, 1.5.*.1,
> +#              2.2.*.1, 2.3.*.1, 2.4.*.1, 2.5.*.1,
> +#              3.2.1.1, 3.3.1.1, 3.4.1.1, 3.5.1.1

“Checkout updates always leave a detached HEAD” seems easier to
explain, so I'm leaning that way.

> +# D5: When in merge/rebase-mode, and 'submodule update' actually ends up doing
> +#     a merge/rebase, that will happen on the current branch (or detached HEAD)
> +#     and NOT on the configured (or default) .branch. Is this OK? Should we
> +#     abort (or at least warn) instead? (In general, .branch seems only to
> +#     affect the submodule's HEAD when the submodule is first cloned.)
> +#     (This discussion point is related to D3 and D6)
> +#     Affects: 1.3.1.3, 1.3.1.5, 1.3.1.7, 1.3.2.>=2,
> +#              1.4.1.3, 1.4.1.5, 1.4.1.7, 1.4.2.>=2,
> +#              2.3.1.3, 2.3.1.5, 2.3.1.7, 2.3.2.2, 2.3.2.4, 2.3.2.6,
> +#              2.4.1.3, 2.4.1.5, 2.4.1.7, 2.4.2.2, 2.4.2.4, 2.4.2.6
> +#              3.3.1.3, 3.3.1.5, 3.3.1.7
> +#              3.4.1.3, 3.4.1.5, 3.4.1.7

When I wrote the --remote series that added submodule.<name>.branch
(which eventually landed as v8 of that series [4]), I initially
imagined the setting as the name of the local branch [5], but
transitioned to imagining it as the name of the remote-tracking
branch in v5 of that series [6].
There were no major logical changes between v5 and v8.  With the v8
version that landed in Git v1.8.2, submodule.<name>.branch was clearly
the name of the remote-tracking branch, and we gave no way to
separately configure the local branch.

Recently, we decided that local branches might be important after all
[7], which led to the partially landed v5 of my local-branch-creation
series [8], now partially reverted with d851ffb (Revert "submodule:
explicit local branch creation in module_clone", 2014-04-02).  Like v4
of that series [9], I considered the landed-and-now-reverted v5 to be
a stop-gap until we got better local-branch handling.

Anyhow, that's why submodule.<name>.branch is only important when we
interact with the remote repository (during clones and --remote
updates).  We've never landed a patch that explicitly addresses what
the local branch should be.

> +# D6: The meaning of submodule.<name>.branch is initially confusing, as it does
> +#     not really concern the submodule's local branch (except as a naming hint
> +#     when the submodule is first cloned). Instead, submodule.<name>.branch is
> +#     really about which branch in the _upstream_ submodule

Which is how gitmodules(5) explains it:

  submodule.<name>.branch
    A remote branch name for tracking updates…

> +#     submodule.<name>.url, or by the submodule's remote.origin.url?)
> +#     want to integrate with.

The submodule's remote.origin.url for everything except the initial
clone (*.*.*.1).  See my response to T2.

> …                               This is probably the more useful setting, and it
> +#     becomes obviously correct after (re-)reading gitmodules(5) and
> +#     git-config(1). However, from just reading the "update" section in
> +#     git-submodule(1) (or not even that), things are not so clear-cut. Would
> +#     submodule.<name>.upstream (or .remote-branch, or similar) be a better
> +#     name for this?

Are the docs from 23d25e4 (submodule: explicit local branch creation
in module_clone, 2014-01-26; now reverted with d851ffb, Revert
"submodule: explicit local branch creation in module_clone",
2014-04-02) clearer?  Maybe we can salvage some of those docs even
though we've reverted the actual code changes?

> +# D7: What to do when .branch refers to a branch that is missing from upstream?
> +#     Currently, when trying to clone, the clone fails (which causes 'git
> +#     submodule update --remote' to fail), but leaves the submodule in an
> +#     uninitialized state (there is a .git, but the work tree is missing).
> +#     This is probably not the behavior we want...
> +#     Affects: pre, 3.2.2.1, 3.3.2.1, 3.4.2.1, 3.5.2.1

I think we should remove the submodule's .git file after the failed clone.

Cheers,
Trevor

[1]: http://thread.gmane.org/gmane.comp.version-control.git/240336
[2]: http://thread.gmane.org/gmane.comp.version-control.git/245283
[3]: http://thread.gmane.org/gmane.comp.version-control.git/245357
[4]: http://thread.gmane.org/gmane.comp.version-control.git/211830
[5]: http://article.gmane.org/gmane.comp.version-control.git/210730
[6]: http://article.gmane.org/gmane.comp.version-control.git/210764
[7]: http://thread.gmane.org/gmane.comp.version-control.git/239799
[8]: http://thread.gmane.org/gmane.comp.version-control.git/241112
[9]: http://article.gmane.org/gmane.comp.version-control.git/240498

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy