Jon Smirl <jonsmirl@xxxxxxxxx> wrote:
> How do you deal with dense history packs? These packs take many
> hours to make (on a server class machine) and can be half the size
> of a regular pack. Shouldn't there be a way to copy these packs
> intact on an initial clone? It's ok if these packs are specially
> marked as being ok to copy.

These should be copied as-is. Basically, object enumeration lists
every reachable object, which should include every object in this
pack if it's a "dense history pack". We then start to write out each
object. As each object is written we look to see if it already
exists in a pack. It does (in your dense history pack), so we then
look to see if its delta base is also in the output list (it is),
and if so we send the data as-is. (A rough sketch of that test is
at the end of this mail.)

One of the bigger costs with such a clone is building that huge list
of objects that need to be sent. The primary cost appears to be
unpacking the trees from the "dense history pack", where delta
chains are usually quite long.

The GSoC 2009 pack caching project idea is based on the theory that
we should be able to save a list of objects that are reachable from
some fixed point (e.g. a very well known, stable tag), and avoid
needing to read these ancient trees. (Also sketched below.) But
it's just a theory. Caching always costs you management overhead,
and it may not save us that much time.

Most of the theory here is also based on JGit's performance during
packing, *not* git-core's. I came up with the object list caching
idea because JGit's object enumeration is just pitiful. (It's Java,
what do you want? If you wanted fast, you'd use portable
assembler... like git-core does.) Whether or not it's worth applying
to git-core is another story entirely.

-- 
Shawn.
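
P.S. In case the write-out logic above isn't clear, here is a rough
sketch of the reuse test in C. The struct and the helpers
(in_output_list(), copy_raw_from_pack(), recompress()) are made up
for illustration; this is not git-core's actual code or API:

    /* A sketch only; every name here is hypothetical. */
    struct out_obj {
        struct out_obj *delta_base;  /* NULL if stored whole */
        int in_existing_pack;        /* already on disk in some pack? */
    };

    extern int in_output_list(const struct out_obj *base);
    extern void copy_raw_from_pack(struct out_obj *obj);
    extern void recompress(struct out_obj *obj);

    static void write_one(struct out_obj *obj)
    {
        if (obj->in_existing_pack &&
            (!obj->delta_base || in_output_list(obj->delta_base))) {
            /*
             * The base is in the same output set, so the existing
             * delta stays resolvable on the receiving end; stream
             * the already-compressed bytes straight out of the pack.
             */
            copy_raw_from_pack(obj);
            return;
        }
        /* Otherwise find a new delta base, or store the object whole. */
        recompress(obj);
    }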
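
P.P.S. And a sketch of the caching theory, equally hypothetical
(none of these names exist anywhere today): persist the set of
objects reachable from a stable anchor ref, then only walk the
history newer than that anchor.

    /* A sketch only; every name here is hypothetical. */
    struct object_list_cache {
        unsigned char anchor_sha1[20]; /* the stable tag we cached from */
        unsigned long nr_objects;
        unsigned char (*sha1)[20];     /* all objects reachable from it */
    };

    extern struct object_list_cache *load_object_cache(void);
    extern int is_ancestor(const unsigned char *anc,
                           const unsigned char *tip);
    extern void add_objects(unsigned char (*sha1)[20], unsigned long nr);
    extern void walk_objects(const unsigned char *tip,
                             const unsigned char *stop);

    static void enumerate_for_pack(const unsigned char *tip_sha1)
    {
        struct object_list_cache *cache = load_object_cache();

        if (cache && is_ancestor(cache->anchor_sha1, tip_sha1)) {
            /* Ancient trees come straight from the cache... */
            add_objects(cache->sha1, cache->nr_objects);
            /* ...so we only walk commits newer than the anchor. */
            walk_objects(tip_sha1, cache->anchor_sha1);
        } else {
            walk_objects(tip_sha1, NULL); /* fall back to a full walk */
        }
    }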