Re: [PATCH 6/6] Teach core object handling functions about gitlinks

Martin Waitz <tali@xxxxxxxxxxxxxx> · Thu, 12 Apr 2007 01:54:50 +0200

hoi :)

On Wed, Apr 11, 2007 at 08:16:10AM -0700, Linus Torvalds wrote:
> Branches in submodules actually in many ways are *more* important than 
> branches in supermodules - it's just that with the CVS mentality, you 
> would never actually see that, because CVS obviously doesn't really 
> support such a notion.

I fully agree with you about the importance of submodule branches.
In fact, I want to make them even more important and usable!

And by the way, I forgot about CVS long ago ;-)

> So I'd argue that branches in submodules give you:
> 
>  - you can develop the submodule *independently* of the supermodule, but 
>    still be able to easily merge back and forth.
> 
>    Quite often, the submodule would be developed entirely _outside_ of the 
>    supermodule, and the "branch" that gets the most development would thus
>    actually be the "vendor branch", entirely outside the supermodule. Call 
>    that the "main" branch or whatever, inside the supermodule it would 
>    often be something like the remote "remotes/origin/master" branch.
> 
>    So inside the supermodule, the HEAD would generally point to something 
>    that is *not* necessarily the "main development" branch, because the 
>    supermodule maintainer would quite logically and often have his own 
>    modifications to the original project on that branch. It might be a 
>    detached branch, or just a local branch inside the submodule.

I fully agree.

>  - branches inside submodules are *also* very useful even inside the 
>    supermodule, ie they again allow topic work to be fetched into the
>    submodule *without* having to actually be part of the supermodule,
>    or as a way to track a certain experimental branch of the supermodule.
> 
>    I suspect that most supermodule usage is as an "integrator" branch, 
>    which means that the supermodule tends to follow the "main 
>    development", and the whole point of the supermodule is largely to have 
>    a collection of "stable things that work together". 
> 
>    In contrast, branches within submodules are useful for doing all the 
>    development that is *not* yet ready to be committed to the supermodule, 
>    exactly because it's not yet been tested in the full "make World" kind 
>    of situation.

I fully agree.
You are just so much better at describing things than I am...

> > Whenever you do a checkout in the supermodule you also have to update
> > the submodule and this update has to change the same thing which is read
> > above.
> 
> I suspect (but will not guarantee) that the right approach is that a 
> supermodule checkout usually just uses a "detached HEAD" setup. Within the 
> context of the supermodule, only the actual commit SHA1 matters, not what 
> branch it was developed on (side note: I haven't decided if we should 
> allow the SHA1 to be a signed tag object too - the current patches 
> obviously don't care since they never follow the SHA1 anyway, and it might 
> be a good idea).

If you use a detached HEAD then you can no longer switch back to it
once you have switched to some other (independent) branch (for testing
or whatever).
This is my main argument: If you just update some 'special'
refs/heads/from-supermodule (or whatever, maybe get it from
.gitmodules/config) you can still switch between branches, making them
more useful IMHO.

If we create some other way to easily get to the commit referenced by
the index of the supermodule then a detached HEAD is ok for me, too.
But why create two things (this not-yet-existing way to get the
supermodule index entry, plus submodules HEAD) for the same thing?
Why not simply create a new refs/heads/whatever?
This is easy and everybody knows how to work with it.
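
To make the difference concrete, here is a toy Python model of the two
policies. It is only a sketch: the Submodule class, the commit ids "c1" and
"e5" and both checkout functions are made up for illustration and are not
git code. It also ignores everything else git offers for finding a commit
again, such as the reflog.

```python
class Submodule:
    """Toy stand-in for a submodule repository: branch refs plus HEAD."""

    def __init__(self):
        self.refs = {}    # branch name -> commit id
        self.head = None  # "ref: <branch>" when symbolic, else a bare commit id

    def switch(self, branch):
        # Like switching branches: HEAD becomes a symbolic ref to the branch.
        self.head = "ref: " + branch

    def resolve_head(self):
        # Follow a symbolic HEAD to its branch, or return the detached commit.
        if self.head.startswith("ref: "):
            return self.refs[self.head[len("ref: "):]]
        return self.head

    def names_for(self, commit):
        # All branch names that currently point at the given commit.
        return sorted(name for name, c in self.refs.items() if c == commit)


def checkout_detached(sub, commit):
    # Policy A (assumed): the supermodule writes the commit straight into HEAD.
    sub.head = commit


def checkout_dedicated(sub, commit, branch="from-supermodule"):
    # Policy B (assumed): the supermodule only moves one dedicated branch.
    sub.refs[branch] = commit


# Policy A: after switching to an unrelated branch, no name leads back to c1.
detached = Submodule()
detached.refs["experimental"] = "e5"
checkout_detached(detached, "c1")
detached.switch("experimental")
print(detached.names_for("c1"))

# Policy B: the dedicated branch still names c1, so switching back is trivial.
dedicated = Submodule()
dedicated.refs["experimental"] = "e5"
checkout_dedicated(dedicated, "c1")
dedicated.switch("experimental")
dedicated.switch("from-supermodule")
print(dedicated.resolve_head())
```

In this model the only difference between the two policies is whether the
supermodule's commit keeps a name inside the submodule once HEAD moves
elsewhere.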

> So I strongly suspect (and that is what the patch series embodies) that as 
> far as the supermodule is concerned, it should *not* matter at all what 
> branch the subproject was on. The subproject can use branches for 
> development, and the supermodule really doesn't care what the local 
> branchname was when a commit was made - because branch-names are *local* 
> things, and a branch that is called "experimental" in one environment 
> might be called "master" in another.

Fully agree.

Please don't confuse my "I always want to use one dedicated branch" with
"I always want to use one special branch from the submodule project".
This refs/heads/whatever I am talking about is _purely_ for ease of
use of the submodule inside the supermodule.  It is in no way linked
to the branchnames that are used by the submodule project.
Well, apart from the fact that you can merge back and forth between them,
of course.

> So once the commit hits the superproject, the branch identities just go 
> away (only as far as the superproject is concerned, of course - the 
> subproject still stays with whatever branches it has), and the only thing 
> that matters is the commit SHA1.

Fully agree.

> > Updating the branch which HEAD points to is dangerous.
> 
> I would strongly suggest that the *superproject* never really change the 
> status of the subproject HEAD, except it updates it for "pull/reset", and 
> then it just would use whatever the subproject decided to use.
> 
> The subproject HEAD policy would be entirely under the control of the 
> subproject. If the subproject wants to use a branch to track the 
> superproject, go wild: have a real branch that is called "my-integration" 
> and make HEAD a symref to that (and thus any work in the superproject will 
> update that branch - something that is visible when you pull directly from 
> that subproject!)

So you now have this nice "my-integration" branch lying next to other
independent (not-supermodule-related) branches.
If you want to _switch_ to one of these unrelated branches you obviously
have to change HEAD, and suddenly your unrelated branches are
considered to be part of the supermodule (ok, not yet part of its
index of course, but now all supermodule operations would work on
this unrelated branch).

I want to preserve these unrelated branches and see them as a strong
feature.  Branches in submodules should be independent from the
supermodule _because_ the supermodule has no notion of which branch
is used.

> But quite often, I suspect that a subproject would just use a detached 
> HEAD. The subproject may have branches of its own, of course, but you can 
> think of HEAD as not being connected to any of its "own" branches, but 
> simply being the "superproject branch". That's a fairly accurate picture 
> of reality, and using "detached HEAD" sounds like a very natural thing to 
> do in that situation.

Only that you lose your nice detached HEAD view once you start using
those nice branches inside your submodule.

> So I really think you can do both, and I think using HEAD inside the 
> superproject gives you exactly that flexibility - you can decide on a 
> per-subproject basis whether HEAD should track a real local branch in a 
> subproject, or whether it should be detached.
> 
> (Side note: if you do *not* use detached HEAD, I suspect the .gitmodules 
> file could also contain the branchname to be used for the subproject 
> tracking, but I think that's a detail, and quite debatable)
> 
> > So my advice is:
> > Always read and write one dedicated branch (hardcoded "master" or
> > configurable) when the supermodule wants to access a submodule.
> 
> So the main reasons I don't think that is a good idea are:
> 
>  - it's less flexible: see above on why you might want to use a dedicated 
>    branch *or* just detached HEAD, and why you might want to choose your 
>    own name for the dedicated branch.

In terms of flexibility, what matters is what you can do with the
submodule.  Being able to use branches just like in a normal
repository ("switch the branch to go to another, unrelated branch")
is a plus for me.

A detached HEAD does not give the same level of flexibility as a real
head.

>  - it's also going to be quite confusing when the superproject sees 
>    something *else* than what is actually checked out.

Well, the user explicitly expressed his intent to switch to another
branch!  In a normal repository you are not confused about the working
directory not being in sync with "master", and we always prominently state
which branch you are on.  Of course this has to be clear for submodules,
too.  So if you do git-status in the supermodule it should print some
"submodule is on different branch"-dirty marker.
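
A rough sketch of such a marker in Python, purely for illustration. The
function name, the default branch name "from-supermodule" and the exact
message texts are assumptions; nothing like this exists in git-status
today. The only real git detail used is that a symbolic HEAD has the form
"ref: refs/heads/<name>".

```python
def submodule_status(head, tracking_branch="from-supermodule"):
    """Return a one-line status marker for a submodule.

    head is the submodule's HEAD content: "ref: refs/heads/<name>" when it
    is a symbolic ref, otherwise a bare commit id (detached HEAD).
    """
    prefix = "ref: refs/heads/"
    if head.startswith(prefix):
        branch = head[len(prefix):]
        if branch == tracking_branch:
            # HEAD follows the branch the supermodule reads and writes.
            return "submodule is on " + branch
        # The user switched away on purpose; say so prominently.
        return "submodule is on different branch (" + branch + ")"
    return "submodule HEAD is detached at " + head


print(submodule_status("ref: refs/heads/from-supermodule"))
print(submodule_status("ref: refs/heads/experimental"))
```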

At least I have had some situations where I wanted to use something like
this: use some experimental branch which should not be directly touched
by the supermodule.  Instead, provide a method ("git merge
from-supermodule") to sync your working branch with new stuff from
the supermodule.

>    This is an equally strong argument for just using HEAD - when we
>    actually implement a
> 
> 	 git diff --subproject
> 
>    flag that recurses into the subproject, if you don't use HEAD inside 
>    the subproject, that suddenly becomes a *very* confusing thing.

This is right.  Suddenly we have one more player in the field which
you can diff against.

Before submodules:
tree <-> index <-> working file

submodules always using HEAD:
tree <-> index <-> submodule HEAD <-> submodule working dir

submodules using some dedicated branch:
tree <-> index <-> subm. "from-supermodule" <-> subm. HEAD <-> subm. wd

I haven't thought about which diff really makes sense in which
situation.
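
As a small aid for thinking about it, the three chains above can be written
down as data and the adjacent comparisons enumerated. This is only a sketch
in Python; the names CHAINS and adjacent_diffs are made up here, and it says
nothing about which of these diffs a real command should offer.

```python
# The three comparison chains from the text, left to right.
CHAINS = {
    "before submodules": [
        "tree", "index", "working file",
    ],
    "submodules always using HEAD": [
        "tree", "index", "submodule HEAD", "submodule working dir",
    ],
    "submodules using some dedicated branch": [
        "tree", "index", 'subm. "from-supermodule"', "subm. HEAD", "subm. wd",
    ],
}


def adjacent_diffs(chain):
    """Return every pair of neighbouring states, i.e. each '<->' in a chain."""
    return list(zip(chain, chain[1:]))


for name, chain in CHAINS.items():
    pairs = adjacent_diffs(chain)
    print(name + ": " + str(len(pairs)) + " adjacent diffs")
    for left, right in pairs:
        print("    " + left + " <-> " + right)
```

The dedicated branch adds exactly one more adjacent pair (four instead of
three), which is the extra player mentioned above.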

-- 
Martin Waitz