Hi Peff,

On Mon, 12 Nov 2012, Jeff King wrote:

> On Mon, Nov 12, 2012 at 11:27:14PM +0100, Jean-Jacques Lafay wrote:
>
> > 2012/11/11 Jeff King <peff@xxxxxxxx>:
> > > On Sun, Nov 11, 2012 at 05:46:32PM +0100, René Scharfe wrote:
> > >
> > > Ultimately, I have some ideas for doing this in a breadth-first way,
> > > which would make it more naturally iterative. It would involve
> > > having N bits of storage per commit to check N tags, but it would
> > > mean that we could get accurate answers in the face of clock skew
> > > (like the merge-base calculation, it would merely get slower in the
> > > face of skew).
> >
> > I guess the optimal algorithm may also depend on the general shape of
> > the commit graph, but intuitively, I'd say that the critical factor is
> > the number and distribution of tags. As soon as you have a significant
> > number of tags (let's say 1% of the commits are tagged, evenly
> > distributed), you'll quickly end up with every commit marked as
> > containing the target commit or not, so that each additional tag check
> > is cheap.
> >
> > This suggests a complexity of O(number of commits) more often than
> > not, however you choose to traverse the graph.
>
> We can do much better than O(number of commits), though, if we stop
> traversing down a path when its timestamp shows that it is too old to
> contain the commits we are searching for. The problem is that the
> timestamps cannot always be trusted, because they are generated on
> machines with wrong clocks, or by buggy software. This could be solved
> by calculating and caching a "generation" number, but last time it was
> discussed there was a lot of arguing and nothing got done.

Sadly, it is not only machines with skewed clocks: buggy third-party
SCMs in particular make this more than just problematic. In a git-svn
clone that was used as a base for heavy Git development, I encountered
quite a lot of Jan 1, 1970 commits.

It just cannot be helped: we must distrust timestamps completely.

Ciao,
Dscho
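
P.S.: To make the "generation number" idea concrete, here is a rough
sketch (in Python, not Git's code; all names are mine) of the scheme as
it was discussed: a root commit gets generation 1, every other commit
gets 1 + the maximum generation of its parents. Unlike timestamps,
gen(ancestor) < gen(descendant) then holds by construction, so a
reachability walk can prune paths without trusting any clock -- even on
those Jan 1, 1970 commits.

```python
from collections import deque

def generation_numbers(parents):
    """parents: dict mapping commit -> list of parent commits.

    gen(c) = 1 for roots, else 1 + max(gen(p)) over parents p.
    Computed iteratively to avoid recursion limits on deep histories.
    """
    gen = {}
    for commit in parents:
        stack = [commit]
        while stack:
            top = stack[-1]
            if top in gen:
                stack.pop()
                continue
            missing = [p for p in parents[top] if p not in gen]
            if missing:
                stack.extend(missing)
            else:
                gen[top] = 1 + max((gen[p] for p in parents[top]), default=0)
                stack.pop()
    return gen

def can_reach(start, target, parents, gen):
    """Is target an ancestor of (or equal to) start?

    Breadth-first walk over ancestors, pruning any commit whose
    generation is below that of the target: such a commit can never
    lead back to the target, regardless of what its timestamp says.
    """
    seen, queue = {start}, deque([start])
    while queue:
        commit = queue.popleft()
        if commit == target:
            return True
        for p in parents[commit]:
            if p not in seen and gen[p] >= gen[target]:
                seen.add(p)
                queue.append(p)
    return False
```

For a toy history A <- B <- C with a side branch A <- D, this yields
gen A=1, B=2, D=2, C=3; a walk from D looking for B is cut off at A
immediately, no timestamps consulted.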