This patch series is rewriting the code that tries to reuse existing packfiles. The code in this patch series was written by GitHub, and Peff nicely provided it in the following discussion: https://public-inbox.org/git/3E56B0FD-EBE8-4057-A93A-16EBB09FBCE0@xxxxxxxxxxxxxx/ The first versions of this patch series were also discussed: v3: https://public-inbox.org/git/20191115141541.11149-1-chriscool@xxxxxxxxxxxxx/ V2: https://public-inbox.org/git/20191019103531.23274-1-chriscool@xxxxxxxxxxxxx/ V1: https://public-inbox.org/git/20190913130226.7449-1-chriscool@xxxxxxxxxxxxx/ Thanks to the reviewers! According to Peff this new code is a lot smarter than what it replaces. It allows "holes" in the chunks of packfile to be reused, and skips over them. It rewrites OFS_DELTA offsets as it goes to account for the holes. So it's basically a linear walk over the packfile, but with the important distinction that we don't add those objects to the object_entry array, which makes them very lightweight (especially in memory use, but they also aren't considered bases for finding new deltas, etc). It seems like a good compromise between the cost to serve a clone and the quality of the resulting packfile. This series has been rebased onto current master ad05a3d8e5 (The fifth batch, 2019-12-10). Other changes since the previous patch series have all been suggested by Peff. Thanks to him! They are the following: - Add note in commit message of patch 3/12. - Move previous patch 4/9 to patch 12/12 at the end of the series to avoid test failures. - Add new patches 5/12 and 6/12. - Improve commit message and documentation of pack.allowPackReuse in patch 8/12. - Improve commit message of patch 10/12. - Extract patch 11/12 from patch 10/12. Jeff King (12): builtin/pack-objects: report reused packfile objects packfile: expose get_delta_base() ewah/bitmap: introduce bitmap_word_alloc() pack-bitmap: introduce bitmap_walk_contains() pack-bitmap: uninteresting oid can be outside bitmapped packfile pack-bitmap: simplify bitmap_has_oid_in_uninteresting() csum-file: introduce hashfile_total() pack-objects: introduce pack.allowPackReuse builtin/pack-objects: introduce obj_is_packed() pack-objects: improve partial packfile reuse pack-objects: add checks for duplicate objects pack-bitmap: don't rely on bitmap_git->reuse_objects Documentation/config/pack.txt | 7 + builtin/pack-objects.c | 243 +++++++++++++++++++++++++++------- csum-file.h | 9 ++ ewah/bitmap.c | 13 +- ewah/ewok.h | 1 + pack-bitmap.c | 192 ++++++++++++++++++--------- pack-bitmap.h | 6 +- packfile.c | 10 +- packfile.h | 3 + 9 files changed, 362 insertions(+), 122 deletions(-) -- 2.24.1.498.g561400140f