On Wednesday 2007, May 16, Junio C Hamano wrote:

>  (3) git-checkout finds there is .gitmodules file in the
>      tree (and the checked-out working file), which
>      describes these subprojects.  It looks at the config
>      and notices that it does not yet know about them
>      (obviously this is true, as this is the first checkout
>      after clone, but I am trying to outline how checkout
>      after a merge should work in the general case).
>
>      It determines where to fetch that subproject from,
>      perhaps it uses the default URL described in
>      .gitmodules file to, while asking the user for
>      confirmation and giving the user a chance to override
>      it.  And it records something in the config -- now that
>      project is known to this repository.

I've been thinking about this .gitmodules thing and have a concern.

Aren't we falling into the svn:externals trap?  The svn:externals
property is analogous to our .gitmodules file.  svn properties were
basically just version-controlled out-of-tree metadata (making them
annoying to work with - in-tree is better).  svn "submodule" support was
done by writing something like

  subproject svn://host/blah/blah

in the svn:externals property attached to the directory that the
"subproject" directory was in.

To translate:

  svn propset svn:externals "subproject svn://blah/blah/blah" .

  git clone git://blah/blah/blah subproject
  git add subproject

The hole that this sort of thing gets you into is that the svn:externals
property is version controlled.  Time passes since you added the
external; in that time the URL becomes invalid.  No problem, you simply
change the svn:externals property.  KABOOM.  Now any historical checkout
fails because it checks out the svn:externals property from that
checkout and tries to use the wrong URL.

Our in-tree .gitmodules will have the same problem.
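
To make the failure concrete, here is a rough sketch in Python.  It is a
toy model only, not anything git or svn actually does: the revision
numbers, the URLs, the LIVE_URLS set and the checkout() helper are all
made up for illustration.  It assumes each revision carries its own copy
of the subproject URL, as svn:externals does and as an in-tree
.gitmodules would.

```python
# Toy model of a version-controlled subproject URL.
# Everything here (revisions, URLs, helper names) is hypothetical.

# Servers that still answer today; the old host has gone away.
LIVE_URLS = {"svn://newhost/blah"}

# revision -> URL recorded in that revision's tree
history = {
    1: "svn://oldhost/blah",  # external added
    2: "svn://oldhost/blah",  # unrelated change, URL untouched
    3: "svn://newhost/blah",  # URL corrected after the server moved
}


def checkout(rev):
    """Return the URL a checkout of `rev` would fetch the subproject from.

    Raises LookupError when the URL recorded in that revision is dead.
    """
    url = history[rev]
    if url not in LIVE_URLS:
        raise LookupError(f"r{rev}: cannot fetch subproject from {url}")
    return url


def checkout_succeeds(rev):
    """True if checking out `rev` can still reach its subproject."""
    try:
        checkout(rev)
        return True
    except LookupError:
        return False


if __name__ == "__main__":
    for rev in sorted(history):
        print(rev, "ok" if checkout_succeeds(rev) else "FAILS")
```

Correcting the URL in revision 3 does nothing for revisions 1 and 2:
their recorded value is frozen in history, so every historical checkout
keeps asking for the dead host.
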
I recognise that you've mitigated that with some "confirm with the user,
store in the config" hand waving; but that is just hiding the problem:
the submodule URL is not something that should be version controlled; it
is an all-of-history property; when it changes for revision N it changes
for revision N-1, N-2, N-3, etc.  Storing it in .gitmodules implies that
its value in the past has meaning - it doesn't.

You mentioned yourself that the problem is not confined to the temporal
accuracy of .gitmodules; there is spatial accuracy too - there is no
guarantee that user A wants to use the same submodule URL as user B.

Fast forward to when we've got submodule support; let's say you start
using it for git-gui (for example).  Somehow (let's leave the "how" till
later) I've gotten a working git tree with a git-gui checked out.  I go
to my laptop and clone that repository (note: NOT the upstream
repository).  When git-clone hits the git-gui submodule it should not go
looking for the upstream git-gui; I will want it to clone my local
git-gui submodule.  i.e. the in-tree .gitmodules URL for git-gui will be
wrong.

I hope the above shows that in-tree .gitmodules is wrong; it can only
ever be a hint, and in a great number of cases it will be an incorrect
hint.

I know it's so enticing to store it in-tree; it would be great because
the normal repository object transfer mechanism would get the URL of the
submodule to the receiver with no changes to current infrastructure.  I
say: tough luck - we need another mechanism.

The submodule URL is a per-repository setting, not a per-project
setting.  When fetching, some out-of-band mechanism for telling the other
side what URL _this_ repository thinks the submodule is at needs to be
supplied.  I don't know what space there is in the git protocol for
putting that information, but I suspect that that is where it needs to
go.

As an alternative to that, the supermodule could be given the ability to
proxy for the submodule during clone.
It knows where the submodule is stored from its point of view; is there
scope for doing a virtual-server-like system where the supermodule
git-daemon just changes to the submodule repository (in the case it is
local) and thereby gives the downstream git access to the submodule
without it even needing a URL?

>  * Perhaps add 'tree' entries in the index.  This may make the
>    current cache-tree extension unnecessary, and I suspect it
>    will simplify various paths that deal with D/F conflicts in
>    the current codebase.
>
>    I suspect this might need 1.6, as it is a one-way backward
>    incompatible change for the 'index', but 'index' is local so
>    it might not be such a big deal.  In the worst case, when the
>    users find "git checkout" from 1.5.2 does not work in a
>    repository checked out with such an updated index format, we
>    could ask them to "rm -f .git/index && git checkout HEAD".

I don't think even that would be necessary.  Assuming that the new index
format is a superset of the old index format, the only way that tree
entries would get in the index would be by using git-1.6.  Almost by
definition then, if they are in there your git is up-to-date enough to
use them.

(modulo me not really understanding what you mean)


Andy

-- 
Dr Andy Parkins, M Eng (hons), MIET
andyparkins@xxxxxxxxx