On Sat, Aug 10, 2013 at 5:16 AM, Jeff King <peff@xxxxxxxx> wrote:
> Another solution could involve not writing the duplicate of Y in the
> first place. The reason we do not store thin-packs on disk is that you
> run into problems with cycles in the delta graph (e.g., A deltas against
> B, which deltas against C, which deltas against A; at one point you had
> a full copy of one object which let you create the cycle, but you later
> deleted it as redundant with the delta, and now you cannot reconstruct
> any of the objects).
>
> You could possibly solve this with cycle detection, though it would be
> complicated (you need to do it not just when getting rid of objects, but
> when sending a pack, to make sure you don't send a cycle of deltas that
> the other end cannot use). You _might_ be able to get by with a kind of
> "two-level" hack: consider your main pack as "group A" and newly pushed
> packs as "group B". Allow storing thin deltas on disk from group B
> against group A, but never the reverse (nor within group B). That makes
> sure you don't have cycles, and it eliminates even more I/O than any
> repacking solution (because you never write the extra copy of Y to disk
> in the first place). But I can think of two problems:
>
>   1. You still want to repack more often than every 300 packs, because
>      having many packs cost both in space, but also in object lookup
>      time (we can do a log(N) search through each pack index, but have
>      to search linearly through the set of indices).
>
>   2. As you accumulate group B packs with new objects, the deltas that
>      people send will tend to be against objects in group B. They are
>      closer to the tip of history, and therefore make better deltas for
>      history built on top.
>
> That's all just off the top of my head. There are probably other flaws,
> too, as I haven't considered it too hard.

Some refinements on this idea:

- We could keep packs in group B ordered as the packs come in. The new
  pack can depend on the previous ones.
- A group index, in addition to the separate index for each pack, would
  solve the linear-search object lookup problem.
--
Duy
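
For illustration, the two refinements above might be sketched roughly as
follows. This is only a toy model in Python, not Git code: the names
thin_base_allowed, build_group_index and lookup are made up for this
example, pack position 0 stands for the main pack (group A), and
positions 1, 2, ... are group B packs in arrival order. It assumes that
deltas inside a single pack are already acyclic, so only cross-pack
bases need the ordering rule.

```python
from bisect import bisect_left


def thin_base_allowed(pack_order, base_order):
    """Hypothetical rule for cross-pack (thin) deltas.

    A pack may only use delta bases stored in packs that arrived
    strictly earlier, so no chain of cross-pack deltas can loop back.
    """
    return base_order < pack_order


def build_group_index(pack_indices):
    """Merge per-pack object id lists into one sorted group index.

    pack_indices[i] lists the object ids stored in the pack at
    position i. The result is a sorted list of (object_id, position).
    """
    merged = []
    for order, ids in enumerate(pack_indices):
        merged.extend((oid, order) for oid in ids)
    merged.sort()
    return merged


def lookup(group_index, oid):
    """Binary-search the group index; return the pack position or None."""
    # (oid, -1) sorts before every real entry for oid, so bisect_left
    # lands on the first entry with that id, if there is one.
    i = bisect_left(group_index, (oid, -1))
    if i < len(group_index) and group_index[i][0] == oid:
        return group_index[i][1]
    return None
```

Under this rule every cross-pack delta points at a strictly earlier
pack, so no cycle can span packs, and the merged index replaces the
linear walk over per-pack indices with a single binary search.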