On Thu, 2006-07-13 at 22:16 -0700, Linus Torvalds wrote: > So to give an example, let's say we have > > HEAD -> A > / \ > B C > / \ \ > D E F > \ / / \ > G H I > ....... > > the difference between --topo-order and the default ordering for > > git rev-list HEAD > > is most visible for commit 'G'. > > For --topo-order, we guarantee that before we show 'G', we _will_ have > shown both 'D' and 'E'. In other words, --topo-ordering guarantees that it > shows _all_ children before it shows the parent. Ah, OK. Then it should probably be fine. I'll talk myself through it... We're building a parallel graph of commits, containing a _subset_ of the commits in the master tree -- only those which touch certain files. For each 'interesting' commit X, we create a corresponding commit X' in the slave tree -- we create the corresponding tree object, and we also recursively create its parent commits -- replacing each parent in the original commit X with the slave-tree equivalent of the closest _interesting_ ancestor commit. It's that "closest interesting ancestor" which we're finding with the 'rev-list --max-count-1 -- myfile' invocation. The extract-khdrs-stage2.sh script is a simple example of this, and differs from the other script mostly in the way that it creates the _tree_ objects. So working from your example above, and assuming that only commits I and E actually change the files we care about. This means that merges A, B and F are _also_ going to show up in the output of 'rev-list -- myfile'. So the slave tree will look like this: A' / \ B' F' | | E' I' The interesting case, if I'm trying to convince myself that my 'slave' tree is always going to have the correct topology, is when a merge commit is _missing_ from the rev-list output -- for example, if commits D and E in your original tree both make the _same_ change, then I believe that the merge commit B will no longer show up, because 'myfile' is identical in B and in both of its parents. In that case, we accept that the representation isn't going to be perfect -- the left-hand parent of A' is going to appear to be _either_ D' or E', but not B'. In fact, since D' and E' are _identical_ as far as we're concerned, it doesn't really matter which is chosen. The other one of the two becomes an unused branch with no children -- we end up with a graph looking like this. A' / \ D' E' F' \/ | I' ... and the parent of D' and E' is the closest ancestor of G which actually touches the files we care about, of course. All we care about, in this case, is that the first commit listed by rev-list is _either_ D or E, and not something further down the tree. And that's obviously true from your description of the 'weak ordering', so yes -- it does look like I can drop the '--topo-order'. Thanks. (It would actually be quite nice if I _could_ find a cheap way to include commit B' in that final example, but it's such a rare case and it would be so expensive to do it that I don't think it's worth pursuing.) -- dwmw2 - : send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html