Jeff King <peff@xxxxxxxx> writes:

> On Tue, Feb 27, 2007 at 12:16:42AM -0800, Junio C Hamano wrote:
>
>> But as Shawn pointed out, Octopus makes bisect less (much less)
>> efficient for the end users, I tend to think the current 16 is
>> already insanely large.
>
> Did you look at my "why I need a huge octopus" description? Is there a
> better way to do it? Should I simply do a bunch of pair-wise merges?
> I'll almost certainly never bisect it,...

I hate having to compose this message because I know I will end up saying negative things without offering anything constructive.

I do not think bundling commits from multiple unrelated projects in one commit (some people seem to have called this Hydra in the past) is a good practice, regardless of size.

For the sake of simplicity, suppose you are bundling two projects A and B. The first such commit would have two parent commits (the current tips of A and B). Next time you create another Hydra, what will be its parents?

 * You do not care about the ancestry of Hydra itself, so it has two parents, the then-current tips of A and B?

 * You do care about the ancestry of Hydra, so the first parent is the previous Hydra commit, the second parent is the then-current tip of A and the third parent is the then-current tip of B?

If you do the former, then I do not think people can follow your progress unless they have access to your reflog, so I am guessing that you are doing the latter.

Now, do you have some files that are maintained by Hydra itself? Duct tape to hold these projects together, perhaps a Makefile to build the whole thing that does not belong to either A or B? I am also guessing the answer is yes, but you said you won't bisect it, so maybe this is not an issue. But let's pretend you have something in the Hydra itself whose evolution history you care about.

Then, perhaps you would need to merge the ancestry of Hydras from time to time, if you have multiple concurrent development tracks of the bundled project.
That means we cannot say the first parent is from Hydra itself and the rest are component projects anymore (well, we cannot say that for the initial Hydra commit itself already, but we could always special case the "root" commit). Perhaps we could say "the last N are components", but then it is not clear what happens when you add a new component.

What bothers me is that in the usual commit all parents are equal, but in this case, you have different kinds of "parent" commits and from the structure of the ancestry chain, you cannot tell which is what kind. The ancestry chain of some "parent" commits represents how the bundling of components has evolved, while other "parent" commits are just pointers into different history. Although pointers to component project commits are represented as "parent" fields in commit objects, I suspect that, more often than not, you wish they were treated as if they were tree objects contained in the toplevel commits for the purposes of many git operations.

If we think about how bisect and merge _should_ work on such an ancestry chain of Hydras, my gut feeling is that the only way that makes sense is to take only the first kind of ancestry (the evolution of the bundling of components) into account. Use them to determine the merge base to perform 3-way merge, count them to find the bisection point, etc.

I am not saying that the problem you are trying to solve is a wrong problem. Rather, it is showing a gap between the structure you are trying to express and the semantics of the ancestry chain git offers. Currently there is nothing but commit objects that can have more than one pointer to other commit objects, so if you wanted to, making an Octopus to fake it may be the only way to do so. But the current ancestry chain semantics git offers is not set up to distinguish the two different meanings of "parent" you are trying to assign to commits, so it is very likely that many things git naturally does will not match what you expect.
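
To make the two meanings of "parent" concrete, here is a toy model in Python. It is only a sketch and not how git stores or walks history: the names Commit, first_parent_chain and reachable are made up for illustration, and the two-component layout (A and B, Hydra commits H1 and H2) follows the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    """Toy commit: a name plus an ordered list of parents.

    Nothing in this structure records what *kind* of parent each
    entry is, which is exactly the gap discussed above.
    """
    name: str
    parents: list = field(default_factory=list)


def first_parent_chain(tip):
    """Follow only parents[0] from tip, guessing it is the Hydra ancestry."""
    chain = []
    commit = tip
    while commit is not None:
        chain.append(commit.name)
        commit = commit.parents[0] if commit.parents else None
    return chain


def reachable(tip):
    """Return the names of every commit reachable through any parent."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit.name in seen:
            continue
        seen.add(commit.name)
        stack.extend(commit.parents)
    return seen


# Component histories: two commits each in projects A and B.
a1 = Commit("A1")
a2 = Commit("A2", [a1])
b1 = Commit("B1")
b2 = Commit("B2", [b1])

# Root Hydra: its parents are the tips of A and B, no previous Hydra.
h1 = Commit("H1", [a1, b1])
# Second Hydra: previous Hydra first, then the new tips of A and B.
h2 = Commit("H2", [h1, a2, b2])

bundling_guess = first_parent_chain(h2)
everything = reachable(h2)
print(bundling_guess)
print(sorted(everything))
```

In this toy history the "first parent is the Hydra" guess already leaks into project A at the root commit (the chain ends in A1), and a walk over all parents sees six commits even though the bundling itself only has two states.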
I think git-log (without any diff options or path limiter) to view the linearized sequence of commit messages is about the only thing that makes some sense, and the size limit of Octopus would probably end up being the least of your problems.

So in that sense, I would very much prefer the solution based on "the (single) tree object contained in the top-level commit has pointers that point at commits of subprojects" approach somebody (sorry I forgot who did this) proposed in the past (well, the very original idea was Linus's "gitlink" which is probably more than a year old).

Before concluding...

Yes, I am aware that you do not even intend to build on top of the history of your imported-from-CVS, so in that sense, you do not care about the ancestry of Hydra itself (it does not even have a history -- just a single state). It's such a one-shot thing that we probably should not even care about (and your commit-tree patch is fine -- I think the only thing in the core git that cares about the maximum number of parents a commit can have is git-blame), but I thought I should mention that it would be an ideal application for a proper subproject support.
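
For contrast, the "tree has pointers to subproject commits" idea can be sketched the same way. Again this is only an illustrative Python model under stated assumptions, not the gitlink design or any git data format: Entry, TopCommit, history and subproject_pins are hypothetical names, and the top-level commit is given a single parent purely to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """One tree entry: either file content or a pointer to a subproject commit."""
    kind: str    # "blob" for ordinary files, "commit" for a subproject pointer
    target: str  # stand-in for an object name


@dataclass
class TopCommit:
    """Top-level commit whose parent only ever means 'previous bundling state'."""
    name: str
    tree: dict
    parent: Optional["TopCommit"] = None


def history(tip):
    """Walk the top-level ancestry; subproject commits never appear in it."""
    names = []
    commit = tip
    while commit is not None:
        names.append(commit.name)
        commit = commit.parent
    return names


def subproject_pins(commit):
    """Return {path: subproject commit} for the 'commit' entries in the tree."""
    return {path: e.target for path, e in commit.tree.items() if e.kind == "commit"}


s1 = TopCommit("S1", {
    "Makefile": Entry("blob", "makefile-v1"),
    "A": Entry("commit", "A1"),
    "B": Entry("commit", "B1"),
})
s2 = TopCommit("S2", {
    "Makefile": Entry("blob", "makefile-v2"),
    "A": Entry("commit", "A2"),  # subproject A moved forward
    "B": Entry("commit", "B1"),  # subproject B unchanged
}, parent=s1)

top_history = history(s2)
pins_now = subproject_pins(s2)
moved = {p: (subproject_pins(s1)[p], c) for p, c in pins_now.items()
         if subproject_pins(s1)[p] != c}
print(top_history, pins_now, moved)
```

Here the ancestry walk only ever sees the two top-level states, while which subproject commit is bundled is ordinary tree content that can be compared between them.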