On Wed, Jul 06, 2011 at 02:54:52AM -0400, Jeff King wrote:
>
> From what we've seen, it seems like skewing into the past is more
> common. It seems to come from importing old commits and using their
> timestamps as the commit timestamps. It would be nice to find a more
> accurate set (I _think_ with future skew like the second example above,
> the patch below will not give wrong answers; it will just be overly
> pessimal and traverse more commits than it needs to).

Yes, and that was indeed my only concern. Since we cannot tell with
certainty if we have skew into the past or into the future, it's not
wrong to always assume skew into the past. It just does not always
produce the shortest run of skewed commits, as you said. And if skews
into the future are rare, then that should not be an issue.

But considering the complexity behind the timestamp based approach,
which you have demonstrated in your analysis, the generation number
concept looks very attractive to me.

It even has potential for the push/pull transport protocol. (Unreliable)
commit timestamps are currently used while searching for common commits.
And there is still the problem of searching down the wrong branch, which
can be especially bad for repos with multiple disjoint histories. For
example, we shouldn't send any HAVEs for commits with generation numbers
greater than the generation number of the wanted ref. Or smaller than
half that (in which case downloading the complete pack would probably be
faster).

Thomas, IIRC you were working on this. Do you think this could help?

Clemens
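
To make the HAVE filtering idea above concrete, here is a minimal sketch in Python. It is not git code: the names `generation_numbers` and `have_candidates` are hypothetical, the history is a plain dict mapping each commit to its parents, and it assumes the client somehow learns the generation number of the wanted ref (for instance from the ref advertisement, which the protocol does not provide today). A root commit is taken to have generation 1, and any other commit one more than the largest generation among its parents.

```python
from typing import Dict, List


def generation_numbers(parents: Dict[str, List[str]]) -> Dict[str, int]:
    """Assign each commit 1 + max(generation of its parents); roots get 1.

    `parents` maps a commit id to the list of its parent ids. This is an
    illustrative in-memory model, not git's object store.
    """
    gen: Dict[str, int] = {}
    for start in parents:
        stack = [start]
        # Iterative post-order walk so deep histories do not hit the
        # recursion limit.
        while stack:
            commit = stack[-1]
            if commit in gen:
                stack.pop()
                continue
            pending = [p for p in parents.get(commit, []) if p not in gen]
            if pending:
                stack.extend(pending)
            else:
                gen[commit] = 1 + max(
                    (gen[p] for p in parents.get(commit, [])), default=0
                )
                stack.pop()
    return gen


def have_candidates(local_gens: Dict[str, int], wanted_gen: int) -> List[str]:
    """Keep only commits worth advertising as HAVEs.

    Assumption: wanted_gen (generation of the wanted ref) is known to the
    client. Commits above it cannot be ancestors of the wanted ref; commits
    below half of it are skipped on the theory that fetching the complete
    pack would probably be faster anyway.
    """
    lower = wanted_gen / 2
    return sorted(c for c, g in local_gens.items() if lower <= g <= wanted_gen)


# Example: a linear history a <- b <- c <- d <- e plus a disjoint root x.
history = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"], "e": ["d"], "x": []}
gens = generation_numbers(history)
candidates = have_candidates(gens, wanted_gen=4)
```

With a wanted ref at generation 4, only commits with generations 2 through 4 remain as candidates, so both the newer commit `e` and the disjoint root `x` are never offered. The half-way cutoff is just the heuristic suggested above and would need measuring against real repositories.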