Re: [RFC] Add --create-cache to repack

Junio C Hamano <gitster@xxxxxxxxx> · Sat, 29 Jan 2011 22:51:05 -0800

Shawn Pearce <spearce@xxxxxxxxxxx> writes:

> I fully implemented the reuse of a cached pack behind a thin pack idea
> I was trying to describe in this thread.  It saved 1m7s off the JGit
> running time, but increased the data transfer by 25 MiB.  I didn't
> expect this much of an increase, I honestly expected the thin pack
> portion to be well, thinner.  The issue is the thin pack cannot delta
> against all of the history; it's only delta compressing against the tip
> of the cached pack.  So long-lived side branches that forked off an
> older part of the history aren't delta compressing well, or at all,
> and that is significantly bloating the thin pack.  (It's also why that
> "newer" pack is 57M, but should be 14M if correctly combined with the
> cached pack.)  If I were to consider all of the objects in the cached
> pack as potential delta base candidates for the thin pack, the entire
> benefit of the cached pack disappears.

What if you instead use the cached pack this way?

 0. You perform the proposed pre-traversal until you hit the tip of cached
    pack(s), and realize that you will end up sending everything.

 1. Instead of sending the new part of the history first and then sending
    the cached pack(s), you send the contents of cached pack(s), but also
    note what objects you sent;

 2. Then you send the new part of the history, taking full advantage of
    what you have already sent, perhaps doing only half of the reuse-delta
    logic (i.e. you reuse what you can reuse, but you do _not_ punt on an
    object that is not a delta in an existing pack).
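
To make the ordering concrete, here is a rough sketch in Python rather than
C. It is only an illustration under stated assumptions, not how pack-objects
is written: StoredObject, plan_send and the find_base callback are all
hypothetical names, and the new objects are assumed to arrive ordered so that
a reused delta's base has already gone out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredObject:
    """Hypothetical view of how an object sits in an existing pack."""
    oid: str
    delta_base: Optional[str] = None  # None means stored as a full object


def plan_send(cached_pack, new_objects, find_base):
    """Return (oid, base_or_None, how) entries in the order they are sent.

    find_base(obj, sent) is a hypothetical delta search that may pick any
    already-sent object as a base, or return None.
    """
    plan = []
    sent = set()

    # Step 1: stream the cached pack as-is, noting what went out.
    for obj in cached_pack:
        plan.append((obj.oid, obj.delta_base, "cached"))
        sent.add(obj.oid)

    # Step 2: send the new history, using anything already sent as a base.
    for obj in new_objects:
        if obj.delta_base is not None and obj.delta_base in sent:
            # The "reuse" half: the existing delta is usable as it stands.
            plan.append((obj.oid, obj.delta_base, "reused-delta"))
        else:
            # Do not punt: look for a base even though the object is not
            # a (usable) delta in an existing pack.
            base = find_base(obj, sent)
            plan.append((obj.oid, base, "new-delta" if base else "full"))
        sent.add(obj.oid)
    return plan
```

The point of recording `sent` in step 1 is that every object in the cached
pack becomes a delta base candidate for step 2, without the cached pack
itself having to be recompressed.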
