On Fri, 14 Jul 2006, David Woodhouse wrote:
> On Thu, 2006-07-13 at 22:16 -0700, Linus Torvalds wrote:
> >
> >         HEAD -> A
> >                / \
> >               B   C
> >              / \   \
> >             D   E   F
> >              \ /   / \
> >               G   H   I
> >                .......
> >
>
> So working from your example above, and assuming that only commits I and
> E actually change the files we care about. This means that merges A, B
> and F are _also_ going to show up in the output of 'rev-list -- myfile'.

Not necessarily.

> So the slave tree will look like this:
>
>          A'
>         / \
>        B'  F'
>        |   |
>        E'  I'

Yes, but ONLY IF the following is true: A is different from _both_ F and B
in the relevant files.

If A == F (in those files), then the A merge will have been simplified
away.

Strictly speaking, what happens is that when it sees the merge A (which
has parents B and C), and sees that _all_ the changes came from C, the
simplification will decide that B simply isn't even interesting, and
rewrite the merge A as having _only_ C as a parent, since C clearly
explains everything that happened to those files, and B had nothing to do
with it.

It will then remove both A (which is no longer a merge) and C, since
neither of them change the files, and will leave you with just

        F'
        |
        I'

instead.

> The interesting case, if I'm trying to convince myself that my 'slave'
> tree is always going to have the correct topology, is when a merge
> commit is _missing_ from the rev-list output

Note that there are only two ways you can be missing a merge:

 - you literally asked for it with "--no-merges"

 - the merge had one parent that was identical to it, and the merge was
   simplified as above.

> In that case, we accept that the representation isn't going to be
> perfect -- the left-hand parent of A' is going to appear to be _either_
> D' or E', but not B'. In fact, since D' and E' are _identical_ as far as
> we're concerned, it doesn't really matter which is chosen. The other one
> of the two becomes an unused branch with no children -- we end up with a
> graph looking like this.
>
>          A'
>         / \
>    D'  E'  F'
>      \/    |
>            I'

You will never see this, because D' is simply not reachable. You can have
either:

 - A got simplified away as a merge entirely, because C was identical, and
   B was thus considered "uninteresting" (as in "it does not matter for
   the end result"), and then the later phase will always remove A too
   (since, by definition, for the merge to be simplified to a non-merge,
   it must be identical to the parent it was simplified to have)

 - or _both_ B and C were different to A in those files, and A still
   exists as a merge, but B was identical to one of its parents (let's say
   E), and was first simplified to "B->E->G", and then because B and E
   were identical, B itself was dropped, and only

          A'
         / \
        E'  F'
        |   |
        G'  I'

   remains.

NOTE NOTE NOTE! This is how "git rev-list" (and all the other related git
tools, like "git log" etc) simplify the tree. It is, in my opinion, the
only sane way to do it, although you can pass in "--full-history" to say
that you don't want any merge simplification at all.

The reason I mention it is that _your_ simplifications may obviously do
something else entirely, and you may obviously have different rules for
how you simplify the tree further. But it sounds like you don't simplify
the history at all (apart from the simplification that git-rev-list did
for you)?

                Linus
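
The two outcomes described above can be tried out on a toy model. What
follows is only a rough sketch of the rule as described in this mail, not
git's actual implementation. The names `simplify` and `make_graph`, the
content labels "v0".."v3", and the root commit Z (standing in for the
elided "......." history) are all made up for illustration, and the graph
is a reduced version of the example without F and H. A commit is dropped
when the files of interest are identical to those of one of its parents,
and for a merge only that parent is followed.

```python
def simplify(commits, head):
    """Return {commit: rewritten_parents} for the commits that survive.

    `commits` maps a name to (parents, content), where `content` stands in
    for the state of the files of interest at that commit.
    """
    resolved = {}  # commit -> nearest surviving commit (itself or an ancestor)
    kept = {}      # surviving commit -> rewritten parent list

    def resolve(name):
        if name in resolved:
            return resolved[name]
        parents, content = commits[name]
        # Parents whose files of interest are identical to this commit's.
        same = [p for p in parents if commits[p][1] == content]
        if same:
            # Follow only that parent; this commit changed nothing, so drop it.
            resolved[name] = resolve(same[0])
        else:
            # Commit differs from every parent (or is a root): keep it and
            # rewrite each parent to its nearest surviving ancestor.
            resolved[name] = name
            new_parents = []
            for p in parents:
                r = resolve(p)
                if r not in new_parents:
                    new_parents.append(r)
            kept[name] = new_parents
        return resolved[name]

    resolve(head)
    return kept


def make_graph(a_content):
    """Reduced, hypothetical version of the example graph."""
    return {
        "Z": ([], "v0"),            # stands in for the elided history
        "D": (["Z"], "v0"),         # does not touch the files
        "E": (["Z"], "v1"),         # changes the files
        "I": (["Z"], "v2"),         # changes the files
        "B": (["D", "E"], "v1"),    # merge, identical to E
        "C": (["I"], "v2"),         # identical to I
        "A": (["B", "C"], a_content),
    }


case_same = simplify(make_graph("v2"), "A")   # A identical to C
case_merge = simplify(make_graph("v3"), "A")  # A differs from both B and C
print(case_same)
print(case_merge)
```

In the first case A, B and C all disappear and only I (on top of Z)
remains. In the second case A survives as a merge whose parents are E and
I. D never shows up in either result, because nothing that survives
points at it.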