Re: [PATCH 6/8] implement metadata cache subsystem

Jeff King <peff@xxxxxxxx> · Mon, 6 Aug 2012 16:31:54 -0400

On Sat, Aug 04, 2012 at 03:49:12PM -0700, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
> 
> > There are some calculations that git makes repeatedly, even
> > though the results are invariant for a certain input (e.g.,
> > the patch-id of a certain commit). We can make a space/time
> > tradeoff by caching these on disk between runs.
> >
> > Even though these may be immutable for a certain commit, we
> > don't want to directly store the results in the commit
> > objects themselves, for a few reasons:
> >
> >   1. They are not necessarily used by all algorithms, so
> >      bloating the commit object might slow down other
> >      algorithms.
> >
> >   2. Because they can be calculated from the existing
> >      commits, they are redundant with the existing
> >      information. Thus they are an implementation detail of
> >      our current algorithms, and should not be cast in stone
> >      by including them in the commit sha1.
> >
> >   3. They may only be immutable under a certain set of
> >      conditions (e.g., which grafts or replace refs we are
> >      using). Keeping the storage external means we can
> >      invalidate and regenerate the cache whenever those
> >      conditions change.
> 
> 4. The algorithm used to compute such values could improve over
> time.  The same advantage argument as 3 applies to this case.

Yeah, agreed. That commit message is a year old, and was written for an
earlier iteration of the patch, which was used for caching commit
generations. There's not really a better algorithm for computing
generations, but your comment certainly applies to rename similarities.
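
To make points 3 and 4 concrete, here is a rough sketch of the
invalidation idea. It is Python for brevity and is not the code from the
patch; every name in it (MetadataCache, validity_token, the JSON file
layout) is made up for illustration. The cache file records a "validity"
token derived from the conditions the values depend on (algorithm
version, grafts, replace refs). If the token on disk does not match the
current one, the stored entries are ignored and recomputed on demand.

```python
import hashlib
import json
import os


def validity_token(algorithm_version, grafts, replace_refs):
    """Hash the conditions under which cached values stay valid.

    Hypothetical helper: the inputs stand in for "which algorithm
    computed this" and "which grafts/replace refs were in effect".
    """
    payload = json.dumps(
        [algorithm_version, sorted(grafts), sorted(replace_refs)]
    )
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


class MetadataCache:
    """Hypothetical on-disk cache mapping commit ids to a derived value."""

    def __init__(self, path, validity):
        self.path = path
        self.validity = validity
        self.entries = {}
        self._load()

    def _load(self):
        try:
            with open(self.path) as f:
                data = json.load(f)
        except (OSError, ValueError):
            return  # missing or corrupt cache file: start empty
        # Only trust entries written under the same conditions.
        if isinstance(data, dict) and data.get("validity") == self.validity:
            self.entries = data.get("entries", {})

    def lookup(self, commit_id, compute):
        """Return the cached value, computing and remembering it on a miss."""
        if commit_id not in self.entries:
            self.entries[commit_id] = compute(commit_id)
        return self.entries[commit_id]

    def save(self):
        # Write to a temporary file and rename, so a reader never sees
        # a half-written cache.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"validity": self.validity, "entries": self.entries}, f)
        os.replace(tmp, self.path)
```

Because the values live outside the commit objects, bumping the
algorithm version or changing the replace refs just produces a new
token, and the old file is treated as empty on the next run.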

-Peff