From: Nick Townsend <nick.townsend@xxxxxxx>
Subject: Re: [PATCH] submodule recursion in git-archive
Date: 2 December 2013 15:55:36 GMT-8
To: Heiko Voigt <hvoigt@xxxxxxxxxx>
Cc: Junio C Hamano <gitster@xxxxxxxxx>, René Scharfe <l.s.r@xxxxxx>, Jens Lehmann <Jens.Lehmann@xxxxxx>, git@xxxxxxxxxxxxxxx, Jeff King <peff@xxxxxxxx>

On 29 Nov 2013, at 14:38, Heiko Voigt <hvoigt@xxxxxxxxxx> wrote:

> On Wed, Nov 27, 2013 at 11:43:44AM -0800, Junio C Hamano wrote:
>> Nick Townsend <nick.townsend@xxxxxxx> writes:
>>> * The .gitmodules file can be dirty (easy to flag, but should we
>>> allow archive to proceed?)
>>
>> As we are discussing "archive", which takes a tree object from the
>> top-level project that is recorded in the object database, the
>> information _about_ the submodule in question should come from the
>> given tree being archived. There is no reason for the .gitmodules
>> file that happens to be sitting in the working tree of the top-level
>> project to be involved in the decision, so its dirtiness should not
>> matter, I think. If the tree being archived has a submodule whose
>> name is "kernel" at path "linux/" (relative to the top-level
>> project), its repository should be at .git/modules/kernel in the
>> layout recent git-submodule prepares, and we should find that
>> path-and-name mapping from .gitmodules recorded in that tree object
>> we are archiving. The version that happens to be checked out to the
>> working tree may have moved the submodule to a new path "linux-3.0/"
>> and "linux-3.0/.git" may have "gitdir: .git/modules/kernel" in it,
>> but when archiving a tree that has the submodule at "linux/", it
>> would not help---we would not know to look at "linux-3.0/.git" to
>> learn that information anyway because .gitmodules in the working
>> tree would say that the submodule at path "linux-3.0/" is with name
>> "kernel", and would not tell us anything about "linux/".
>>
>>> * Users can mess with settings both prior to git submodule init
>>> and before git submodule update.
>>
>> I think this is irrelevant for exactly the same reason as above.
>>
>> What makes this trickier, however, is how to deal with an old-style
>> repository, where the submodule repositories are embedded in the
>> working tree that happens to be checked out. In that case, we may
>> have to read .gitmodules from two places, i.e.
>>
>>  (1) We are archiving a tree with a submodule at "linux/";
>>
>>  (2) We read .gitmodules from that tree and learn that the submodule
>>      has name "kernel";
>>
>>  (3) There is no ".git/modules/kernel" because the repository uses
>>      the old layout (if the user never was interested in this
>>      submodule, .git/modules/kernel may also be missing, and we
>>      should tell these two cases apart by checking .git/config to
>>      see if a corresponding entry for the "kernel" submodule exists
>>      there);
>>
>>  (4) In a repository that uses the old layout, there must be the
>>      repository somewhere embedded in the current working tree (this
>>      inability to remove is why we use the new layout these days).
>>      We can learn where it is by looking at .gitmodules in the
>>      working tree---map the name "kernel" we learned earlier, and
>>      map it to the current path ("linux-3.0/" if you have been
>>      following this example so far).
>>
>> And in that fallback context, I would say that reading from a dirty
>> (or "messed with by the user") .gitmodules is the right thing to
>> do. Perhaps the user may be in the process of moving the submodule
>> in his working tree with
>>
>>     $ mv linux-3.0 linux-3.2
>>     $ git config -f .gitmodules submodule.kernel.path linux-3.2
>>
>> but hasn't committed the change yet.
>>
>>> For those reasons I deliberately decided not to reproduce the
>>> above logic all by myself.
>>
>> As I already hinted, I agree that the "how to find the location of
>> submodule repository, given a particular tree in the top-level
>> project the submodule belongs to and the path to the submodule in
>> question" deserves a separate thread to discuss with area experts.
>
> FYI, I already started to implement this lookup of submodule paths early
> this year[1] but have not found the time to proceed on that yet. I am
> planning to continue on that topic soonish. We need it to implement a
> correct recursive fetch with clone on-demand as a basis for the future
> recursive checkout.
>
> During the work on this I hit too many open questions. That's why I am
> currently working on a complete plan[2] so we can discuss and define how
> this needs to be implemented. It is an asciidoc document which I will
> send out once I am finished with it.
>
> Cheers Heiko
>
> [1] http://article.gmane.org/gmane.comp.version-control.git/217020
> [2] https://github.com/hvoigt/git/wiki/submodule-fetch-config

Heiko

It seems to me that the question that you are trying to solve is
more complex than the problem I faced in git-archive, where we have a
single commit of the top-level repository that we are chasing.
Perhaps we should split the work into two pieces:

a. Identifying the complete submodule configuration for a single
   commit, and
b. the complexity of behaviour when fetching and cloning recursively
   (which of course requires a.)

I'm very happy to work on the first, but the second seems to me to
require more understanding than I currently possess. To work on it, it
would help to have a place to discuss this. I see you have used the
wiki of your fork of git on GitHub. Is that the right place to solicit
input?

Kind Regards
Nick
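
For reference, the lookup Junio outlines in steps (1) to (4) of the quoted
text can be sketched in Python. This is only an illustration under stated
assumptions, not how git implements it: the function names
(parse_gitmodules, locate_submodule_repo) and the two callbacks
(has_modules_dir, is_configured) are hypothetical, the parser handles only
a simplified subset of the git config syntax, and the caller is assumed to
supply the .gitmodules text from the archived tree and from the working
tree.

```python
import re

# Matches a section header such as: [submodule "kernel"]
SECTION = re.compile(r'^\[submodule\s+"(?P<name>[^"]+)"\]\s*$')
# Matches a simple "key = value" line (no quoting or continuation support).
KEYVAL = re.compile(r'^(?P<key>[A-Za-z][A-Za-z0-9-]*)\s*=\s*(?P<value>.*)$')


def parse_gitmodules(text):
    """Return {submodule name: path} from .gitmodules-style text."""
    paths = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        header = SECTION.match(line)
        if header:
            name = header.group("name")
            continue
        if line.startswith("["):
            name = None  # some other section; ignore its keys
            continue
        entry = KEYVAL.match(line)
        if entry and name is not None and entry.group("key").lower() == "path":
            paths[name] = entry.group("value").strip().rstrip("/")
    return paths


def locate_submodule_repo(tree_path, tree_gitmodules, worktree_gitmodules,
                          has_modules_dir, is_configured):
    """Return a repository location for the submodule at tree_path, or None.

    has_modules_dir(name): hypothetical check that .git/modules/<name> exists.
    is_configured(name):   hypothetical check that .git/config has an entry
                           for the submodule.
    """
    wanted = tree_path.rstrip("/")
    # Steps (1) and (2): map the path in the archived tree to a name,
    # using the .gitmodules recorded in that same tree.
    by_name = parse_gitmodules(tree_gitmodules)
    name = next((n for n, p in by_name.items() if p == wanted), None)
    if name is None:
        return None
    # New layout: the repository lives under .git/modules/<name>.
    if has_modules_dir(name):
        return ".git/modules/" + name
    # Step (3): no entry in .git/config means the user never wanted it.
    if not is_configured(name):
        return None
    # Step (4): old layout; the working tree's .gitmodules (possibly
    # dirty) tells us where the embedded repository currently sits.
    current = parse_gitmodules(worktree_gitmodules).get(name)
    return None if current is None else current + "/.git"
```

With the example from the thread, a tree recording "kernel" at "linux" and
a working tree that has since moved it to "linux-3.0", the sketch returns
".git/modules/kernel" for the new layout and "linux-3.0/.git" for the old
one. Passing the checks in as callbacks keeps the name and path mapping
separate from any filesystem access.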