Re: [PATCH] git exproll: steps to tackle gc aggression

Martin Fick wrote:
> So, it has me wondering if there isn't a more accurate way
> to estimate the new packfile without wasting a ton of time?

I'm not sure there is. Adding the sizes of individual packs can be off
by a lot, because your deltification will be more effective if you
have more data to slide windows over and compress. For the purposes of
illustration, take a simple example:

packfile-1 has a 30M Makefile and several tiny deltas. Total = 40M.
packfile-2 has a 31M Makefile.um and several tiny deltas. Total = 40M.

Now, what is the size of packfile-3 which contains the contents of
both packfile-1 and packfile-2? 80M is a bad estimate, because you can
store deltas against just one Makefile.

So, unless you do an in-depth analysis of the objects in the packfiles
(which can be terribly expensive), I don't see how you can arrive at a
better estimate.
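
You can see the gap between the two numbers directly. Here is a
minimal sketch in Python (assuming a repository that currently has
several packs) comparing the naive sum-of-packs estimate against the
size of the single pack that "git repack" actually produces:

    import glob
    import os
    import subprocess

    def pack_sizes(gitdir=".git"):
        pattern = os.path.join(gitdir, "objects/pack/*.pack")
        return {p: os.path.getsize(p) for p in glob.glob(pattern)}

    before = pack_sizes()
    print("naive estimate (sum of packs):", sum(before.values()))

    # -a repacks everything into a single pack, -d drops the old
    # packs, and -f recomputes deltas from scratch, so objects that
    # used to live in different packs can delta against each other.
    subprocess.run(["git", "repack", "-a", "-d", "-f"], check=True)

    after = pack_sizes()
    print("actual consolidated size:", sum(after.values()))

The more redundancy there is across the packs, the further the first
number will be from the second.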

> If not, one approach which might be worth experimenting with
> is to just assume that new packfiles have size 0!  Then just
> consolidate them with any other packfile which is ready for
> consolidation, or if none are ready, with the smallest
> packfile.  I would not be surprised to see this work on
> average better than the current summation,

That assumes all fetches (and pushes) are small, which is probably a
sound rule of thumb; you might want a "smallness threshold" to catch
the occasional large one, although I haven't thought hard about the
problem.
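
If someone wants to experiment with that, here is a minimal sketch of
the selection logic in Python; the function name, the "ready"
collection, and the 1M default threshold are all hypothetical, just to
make the idea concrete:

    def pick_consolidation_target(new_size, packs, ready=(),
                                  small_threshold=1 << 20):
        """Pick a pack to fold a new pack into, pretending that small
        new packs have size 0.

        packs maps pack name -> size in bytes; ready is the collection
        of packs already due for consolidation under whatever rule the
        roller uses.  All names and the 1M default threshold are made
        up for illustration.
        """
        if new_size >= small_threshold:
            return None                  # big pack: use the normal estimate
        if ready:
            return min(ready, key=packs.get)   # join a pending consolidation
        return min(packs, key=packs.get)       # otherwise, the smallest pack

    # For example:
    packs = {"pack-a.pack": 40 << 20, "pack-b.pack": 400 << 20}
    print(pick_consolidation_target(200 << 10, packs))   # -> pack-a.pack

Whatever the function returns would be repacked together with the new
pack; returning None means falling back to the usual size-based rule.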