On Fri, 14 Jul 2006, David Woodhouse wrote: > > > > The default ordering from git-rev-list (and all other revision listing > > things, ie "git log" etc) _does_ guarantee that we never show a child > > before _one_ of its parents has been shown (although "parent" in this case > > may be the command line). > > Does it? I thought at one point it sorted on some random criterion like > alphabetically by author, or some other cosmetic information which isn't > really part of the git structure -- like the timestamp or something? > We still don't enforce monotonicity, do we? The timestamps are still > just fluff? The timestamps are, and always have been, just a heuristic. The output order of git-rev-list is actually entirely well-defined, but it's the _cheap_ ordering, not the strict and full topological one. The cheap ordering means that we don't ever look at the whole history, but it's still a real "DAG reachability ordering" in the sense that when we output a commit, we have _always_ output _one_ full path of commits to reach that commit from one of the starting point. But since you can traverse the DAG in any number of ways, the heuristic is that when there are multiple choices, we pick the one with the most recent commit date. So to give an example, let's say we have HEAD -> A / \ B C / \ \ D E F \ / / \ G H I ....... the difference between --topo-order and the default ordering for git rev-list HEAD is most visible for commit 'G'. For --topo-order, we guarantee that before we show 'G', we _will_ have shown both 'D' and 'E'. In other words, --topo-ordering guarantees that it shows _all_ children before it shows the parent. That's a _very_ very expensive thing to guarantee, because you can't actually tell that you've seen all children on 'G' before you've basically traversed most of the tree. In the above example, you CANNOT tell whether 'F' is a child of 'G', for exmaple. Think about it. You don't know - maybe the missing piece is 'I' -> 'Z' -> 'G', but without having parsed all the commits, you'll never know. [ Actually, strictly speaking, you can guarantee it earlier than before you parsed them _all_: you can guarantee it once _every_single_commit_ whose parents you haven't followed yet is a direct ancestor of 'G' - at that point, and not before, do you know that 'G' can have no more children. That's actually very expensive to compute, so we don't do it - we will walk the whole history, and only _then_ do we use one of the algorithms to generate a topological sort from the full DAG. If somebody knows of an _incremental_ algorithm that doesn't need the full DAG and can do a topo-ordering, that would be wonderful. But it's basically very very very expensive. ] So by default, we don't do that at all. By default, we will print out 'G' whenever we have printed out _any_ path leading to 'G', and 'G' is the commit with the most recent commit date. So we might print things out as A, B, D, G, E ... - notice how we printed out 'E' _after_ we did 'G', but we did have the A->B->D->G path, so G was reachable from the top along the path we printed. > In this case, I really do have commits in the intermediate tree which > don't actually change anything, and I want to filter them out -- I > couldn't see a simple way to do it all in one pass. Ok, in that case, the "." is correct, but the --topo-order should be unnecessary because you only care about the first entry. Linus - : send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html