On Sun, Dec 22, 2013 at 09:55:23PM +0000, Ben Maurer wrote:

> One issue with this approach is that it seems git-pack-index doesn't
> perform as well with thin packs. git-index-pack uses a multi-threaded
> approach to resolving the deltas. However, the multithreading only
> works on deltas that are exclusively in the pack. After the
> multi-threaded phase, it incrementally brings in objects from external
> packs, but in single threaded manner. Many objects in the pack have
> some dependency on an external object, therefore, defeating the
> multithreading.

Yes. It will also just plain perform worse, because it will have to copy
over more external objects. This is somewhat made up for by getting an
actually smaller pack, but I suspect the completed thin-pack ends up
larger than what the server would otherwise send. Because the server is
blindly reusing on-disk deltas (which is good, because it takes load off
of the server), it misses good delta opportunities between objects in
the sent pack (which are likely almost as small, but would not require
fixing on the other end).

Single-threading the extra work we have to do just exacerbates the
problem, of course.

Still, I think it will be a net win for end-to-end wall clock time of
the operation. You are saving CPU time on the server end, and you're
saving network bandwidth with a smaller pack. In my tests on
torvalds/linux, doing a fetch across a local machine (so basically
discounting network improvements), the times look like (this is
end-to-end, counting both server and client CPU time):

  [vanilla]
  real    0m3.850s
  user    0m7.504s
  sys     0m0.380s

  [patched]
  real    0m2.785s
  user    0m2.472s
  sys     0m0.180s

So it was a win both for wall-clock and CPU.

> What's the use case for a pack file with a SHA1 reference living
> inside the pack file (why not just use an offset?) Would it make sense
> to assume that all such files are external and bring them in in the
> first phase.

Once upon a time, ref-delta was the only format supported by packfiles.
Later, delta-base-offset was invented, and the client and server
negotiate the use of the feature before the packfile is generated (and
even when we reuse objects, pack-objects will rewrite the header on the
fly to use ref-delta if necessary).

These days, pretty much everybody supports delta-base-offset, so I don't
think there is any reason index-pack should see ref-delta for a non-thin
object. We could probably teach index-pack an "--assume-refs-are-thin"
option to optimize for this case, and have fetch-pack/receive-pack pass
it whenever they know that delta-base-offset was negotiated.

-Peff
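
For anyone following along who hasn't looked at the two delta encodings,
here is a rough Python sketch of how a pack entry header distinguishes
them. The type codes and varint layout follow the documented pack
format; the function names and the assume_thin() helper are made up for
illustration and are not how index-pack is written. assume_thin() only
models the heuristic suggested above, under the assumption that a sender
that negotiated delta-base-offset never emits ref-delta for an in-pack
base.

```python
# Object type codes from the pack format.
OBJ_OFS_DELTA = 6  # base named by a backwards offset inside the same pack
OBJ_REF_DELTA = 7  # base named by its 20-byte SHA-1; may be outside a thin pack


def parse_entry_header(buf, pos):
    """Parse one pack entry header starting at buf[pos].

    Returns (obj_type, inflated_size, base, next_pos). base is None for
    non-delta objects, an absolute pack offset (int) for ofs-delta, or
    the 20-byte base object name (bytes) for ref-delta.
    """
    start = pos
    byte = buf[pos]
    pos += 1
    obj_type = (byte >> 4) & 0x7
    size = byte & 0x0F
    shift = 4
    while byte & 0x80:  # MSB set means more size bytes follow
        byte = buf[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        shift += 7

    base = None
    if obj_type == OBJ_OFS_DELTA:
        # Big-endian varint with a +1 bias on each continuation byte.
        byte = buf[pos]
        pos += 1
        rel = byte & 0x7F
        while byte & 0x80:
            byte = buf[pos]
            pos += 1
            rel = ((rel + 1) << 7) | (byte & 0x7F)
        base = start - rel  # the offset is relative to this entry's start
    elif obj_type == OBJ_REF_DELTA:
        base = bytes(buf[pos:pos + 20])
        pos += 20
    return obj_type, size, base, pos


def assume_thin(obj_type, ofs_delta_negotiated):
    """Hypothetical heuristic: treat every ref-delta as having an
    external base when delta-base-offset was negotiated."""
    return ofs_delta_negotiated and obj_type == OBJ_REF_DELTA
```

With something like this, a first pass over the pack could collect every
entry for which assume_thin() is true and bring those bases in up front,
rather than discovering them after the threaded phase.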