Re: [RFC] Cache negative delta pairs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 29 Jun 2006, Jeff King wrote:

> > This is way suboptimal.  First there is no reason for the cache to ever 
> > grow to N^2.  At worst it should be N*10 where 10 is the current window 
> > size.
> 
> I assumed the window would change over time (though our total is still
> likely to hang around N*10 rather than N^2).

It doesn't change unless you force a different window size.

> > none of the objects found in a given window.  Therefore we only have to 
> > compute a hash for the object names found in that window and store that 
> > in the cache.  So the cache entries would then be a pair of sha1: first 
> > the sha1 of the victim object, and the sha1 of all sha1 names for the 
> > objects against which the victim object was found not to delta well 
> > against.
> 
> This will fail to hit the cache anytime the window changes. How often
> does the window change? In my test case, I would think anytime I added a
> bunch of new photos, it would be likely that one of them would make it
> into the window, thus invalidating the cache entry and forcing me to try
> against every object in the window (even though I've already tried
> 9/10).

Sure.  But on the lot how often will that happen?
This trades a bit of CPU for much smaller cache which might be worth it.

And even then, since my suggested method implies only one cache lookup 
in a much smaller cache instead of 10 lookups in a larger cache for each 
objects it might end up faster overall even if sometimes some windows 
don't match and deltas are recomputed needlessly.

> Also, is it true to say "if this object did not delta against this
> window, it will never delta?" What about interactions with the depth
> parameter?

Of course a greater depth might allow for a hit where there isn't any 
otherwise.  But changing the delta depth is not something someone does 
that often, and when the depth is changed then you better use -f with 
git-repack as well which like I said should also ignore the cache.

> > So given my GIT repository such a cache would be 7610 * 40 = 304400 
> > bytes if we stick to the full 40 bytes of sha1 to hash bad combinations.
> 
> Keep in mind that it will grow every time the window changes.

What do you mean by window change?


Nicolas
-
: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]