This series implements JGit-style pack bitmaps to speed up fetching and
cloning. For example, here is a simulation of the server side of a clone
of a fully-packed kernel repo (measuring actual clones is harder, because
the client does a lot of work on resolving deltas):

   [before]
   $ time git pack-objects --all --stdout </dev/null >/dev/null
   Counting objects: 3237103, done.
   Compressing objects: 100% (508752/508752), done.
   Total 3237103 (delta 2699584), reused 3237103 (delta 2699584)

   real    0m44.111s
   user    0m42.396s
   sys     0m3.544s

   [after]
   $ time git pack-objects --all --stdout </dev/null >/dev/null
   Reusing existing pack: 3237103, done.
   Total 3237103 (delta 0), reused 0 (delta 0)

   real    0m1.636s
   user    0m1.460s
   sys     0m0.172s

This helps eliminate load on the server side, but it also means that we
actually start transferring objects way faster, which means the clones
finish faster. If you look at current clones of torvalds/linux from
kernel.org, it's almost two minutes before they actually start sending
you any data, during which time the client is twiddling its thumbs.

The bitmaps implemented here are compatible with those produced by JGit.
We can read JGit-produced bitmaps, and JGit can read ours. The one
exception is the final patch, which adds an optional name-hash cache.
It's added in such a way that existing implementations can ignore it,
and is marked with a flag in the header. However, JGit is very picky
about the "flags" field; it will reject any bitmap index with a flag it
does not know about.

The patches are:

  [01/19]: sha1write: make buffer const-correct
  [02/19]: revindex: Export new APIs
  [03/19]: pack-objects: Refactor the packing list
  [04/19]: pack-objects: factor out name_hash
  [05/19]: revision: allow setting custom limiter function
  [06/19]: sha1_file: export `git_open_noatime`
  [07/19]: compat: add endianness helpers
  [08/19]: ewah: compressed bitmap implementation

    Refactoring and support for the rest of the series.

  [09/19]: documentation: add documentation for the bitmap format
  [10/19]: pack-bitmap: add support for bitmap indexes
  [11/19]: pack-objects: use bitmaps when packing objects
  [12/19]: rev-list: add bitmap mode to speed up object lists

    Bitmap reading (you can test it against JGit at this point by
    running "jgit debug-gc", and then cloning or running rev-list).

  [13/19]: pack-objects: implement bitmap writing
  [14/19]: repack: stop using magic number for ARRAY_SIZE(exts)
  [15/19]: repack: turn exts array into array-of-struct
  [16/19]: repack: handle optional files created by pack-objects
  [17/19]: repack: consider bitmaps when performing repacks

    Bitmap writing (you can test against JGit by running
    "git repack -adb", and then running "jgit daemon" to serve the
    result).

  [18/19]: t: add basic bitmap functionality tests

    With reading and writing, we can do our own tests.

  [19/19]: pack-bitmap: implement optional name_hash cache

    And this is our extension.

A similar series has been running on github.com for the past couple of
months, though not every repository has had bitmaps turned on (but some
very busy ones have). We've hopefully squeezed out all of the bugs and
corner cases over that time. However, I did rebase this on a more modern
version of "master"; among other conflicts, this required porting the
git-repack changes from shell to C. So it's entirely possible I've
introduced new bugs. :)

The idea and original implementation for bitmaps come from Shawn and
Colby, of course. The hard work in this series was done by Vicent Marti,
and he is credited as the author in most of the patches. I've added some
window dressing and helped a little with debugging and review. But along
with Vicent, I should be able to help with answering questions during
review, and as time goes on, I'm familiar enough with the code to deal
with bugs and review future changes.

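For anyone wondering what the "flags" pickiness mentioned for the final
patch means in practice, here is a rough Python sketch of the two reader
behaviors. It is only an illustration, not code from this series or from
JGit: the header layout (a "BITM" signature followed by 16-bit version
and flags fields), the flag values, and all function names are
assumptions made up for the example.

```python
import struct

# Hypothetical flag bits, chosen for illustration only.
OPT_FULL_DAG = 0x1    # assumed: a flag that every reader understands
OPT_HASH_CACHE = 0x4  # assumed: marks the optional name-hash cache

# Assumed header prefix: 4-byte signature, 2-byte version, 2-byte flags,
# all in network (big-endian) byte order.
HEADER = struct.Struct(">4sHH")


def read_flags(header_bytes):
    """Return the flags field of an (assumed) bitmap index header."""
    signature, _version, flags = HEADER.unpack_from(header_bytes)
    if signature != b"BITM":
        raise ValueError("not a bitmap index")
    return flags


def strict_reader_accepts(flags, known=OPT_FULL_DAG):
    """A picky reader: reject if any bit outside `known` is set."""
    return (flags & ~known) == 0


def lenient_reader_accepts(flags, required=OPT_FULL_DAG):
    """A tolerant reader: only insist that the required bits are set."""
    return (flags & required) == required


# One index without the extension, one with the extra flag set.
plain = HEADER.pack(b"BITM", 1, OPT_FULL_DAG)
extended = HEADER.pack(b"BITM", 1, OPT_FULL_DAG | OPT_HASH_CACHE)

for name, hdr in (("plain", plain), ("extended", extended)):
    flags = read_flags(hdr)
    print(name, strict_reader_accepts(flags), lenient_reader_accepts(flags))
```

In this sketch the strict reader turns away the "extended" index purely
because of the extra flag bit, even though it could skip the additional
data, while the lenient reader accepts both.
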
-Peff