On Thu, Aug 05, 2010 at 11:22:37AM -0700, Junio C Hamano wrote:

> > That's basically a finer-grained version of what I implemented. Mine
> > finds the _worst_ skew for the whole graph, and never lets you optimize
> > a traversal cutoff more than that skew. So it is nicely bounded
> > space-wise, as it is always a single integer, but you waste effort on
> > the entire traversal because a couple of commits are skewed. Yours
> > optimizes perfectly, but needs O(skewed commits) storage. Which is
> > probably a better tradeoff when the number of skewed commits is tiny
> > (which is what we expect).
>
> One thing missing from the above equation is that the O(skewed commits)
> approach will need O(number of commits) look-ups in the skew database (be
> it a notes tree or whatever), only to find that most of the look-ups
> say "no timestamp tweak required". So I think the global single-integer
> approach you took would probably be better in the overall picture.

I'm not sure it is that bad. Shouldn't it have the same number of lookups
as this scenario:

  # pretend we have some fake timestamps
  for i in 20 40 60; do
    git notes add -m "fake timestamp" HEAD~$i
  done

  # now time it without notes
  time git log --pretty=raw --no-notes >/dev/null

  # and with notes
  time git log --pretty=raw --show-notes >/dev/null

For me, the timing differences are lost in the noise. So perhaps the
lookup isn't all that expensive compared to the actual traversal.

-Peff
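
For anyone wanting to reproduce the timing comparison, here is a
self-contained sketch. The notes and log commands are the ones from the
mail; the throwaway repository setup (temp directory, empty commits,
identity config, and the commit count of 100) is scaffolding I have added
so the snippet runs on its own:

```shell
set -e

# build a throwaway repo so the experiment doesn't touch a real one
dir=$(mktemp -d)
cd "$dir"
git init -q
git config user.email "test@example.com"
git config user.name "Test"

# create a history deep enough for HEAD~60 to exist (count is arbitrary)
for i in $(seq 1 100); do
  git commit -q --allow-empty -m "commit $i"
done

# pretend we have some fake timestamps on a few commits
for i in 20 40 60; do
  git notes add -m "fake timestamp" HEAD~$i
done

# traversal with no notes lookups at all
time git log --pretty=raw --no-notes >/dev/null

# traversal that does one notes lookup per commit
time git log --pretty=raw --show-notes >/dev/null
```

The point of the comparison is that `--show-notes` performs a notes-tree
lookup for every commit walked, so if its runtime is indistinguishable
from `--no-notes`, the per-commit lookup cost is small relative to the
traversal itself.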