This patch series is upstreaming work done by GitHub and available in:

  https://github.com/peff/git/commits/jk/delta-islands

The above work has already been described in the following article:

  https://githubengineering.com/counting-objects/

The above branch contains only one patch. In this patch series that
patch has been split into 5 patches (1/6 to 5/6), each with its own
commit message, and one patch (6/6) has been added on top. This last
patch implements something that was requested following the previous
iteration.

I kept Peff as the author of the first 5 patches and took the liberty
of adding his Signed-off-by to them.

As explained in detail in the Counting Objects article referenced
above, the goal of the delta island mechanism is to let a hosting
provider have the "forks" of a repository share as much storage as
possible while preventing object packs from containing deltas between
different forks.

If deltas between different forks are not prevented, then when users
clone or fetch a fork, preparing the pack that should be sent to them
can be very costly in CPU time: objects from a different fork must not
be sent, so many deltas might need to be computed again (instead of
reusing existing deltas).
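
To make the rule above concrete, here is a small Python model of it.
This is only a sketch, not the code in delta-islands.c: the ref layout
"refs/virtual/<id>/...", the names island_of() and delta_allowed(), and
the "-" used to join capture groups are assumptions made for
illustration. Python's "re" module also only approximates the EREs that
pack.island takes.

```python
import re

# Hypothetical patterns modeled on the pack.island setting; the
# "refs/virtual/<id>/" layout is an assumption, not part of this series.
ISLAND_PATTERNS = [
    r"refs/virtual/([0-9]+)/heads/",
    r"refs/virtual/([0-9]+)/tags/",
]


def island_of(refname):
    """Return the island name for a ref, or None if no pattern matches."""
    for pattern in ISLAND_PATTERNS:
        m = re.search(pattern, refname)
        if m:
            # Assumption: capture groups are joined with "-" to form the name.
            return "-".join(g for g in m.groups() if g is not None)
    return None


def delta_allowed(target_islands, base_islands):
    """A delta is kept only if the base is in every island of the target."""
    return set(target_islands) <= set(base_islands)


# An object reachable only from fork 1234 may use a base shared by both forks,
# but a shared object may not use a base that only fork 1234 has.
print(island_of("refs/virtual/1234/heads/master"))
print(delta_allowed({"1234"}, {"1234", "5678"}))
print(delta_allowed({"1234", "5678"}, {"1234"}))
```

With this model, a pack prepared for one fork never needs a base object
that belongs only to another fork, which is what allows existing deltas
to be reused when serving a clone or fetch.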

The following changes have been made since the previous iteration:

  * suggested by Dscho: explain in the cover letter what the patches
    are all about

  * suggested by Peff and Junio: improve the commit messages

  * suggested by Junio: add comment before get_delta_base() in
    "packfile.h" in patch 1/6

  * suggested by Duy: move 'pack.island' documentation (in
    "Documentation/config.txt") from patch 2/6 to patch 3/6

  * suggested by Junio: improve pack.island documentation (in
    "Documentation/config.txt") to tell that it is an ERE in patch 3/6

  * suggested by Peff: add doc about 'pack.islandCore' in patch 3/6

  * suggested by Peff: add info about repacking with a big --window to
    avoid the delta window being clogged in
    "Documentation/git-pack-objects.txt" in patch 3/6

  * suggested by Duy: remove `#include "builtin.h"` from
    delta-islands.c in patch 2/6

  * suggested by Duy: mark strings for translation in patch 2/6

  * suggested by Peff: modernize code using ALLOC_ARRAY, QSORT() and
    free_tree_buffer() in patch 2/6

  * suggested by Peff: use "respect islands during delta compression"
    as help text for --delta-islands in "builtin/pack-objects.c" in
    patch 3/6

  * suggested by Junio: improve documentation explaining how capture
    groups from the pack.island regexes are concatenated in
    "Documentation/git-pack-objects.txt" in patch 3/6

  * suggested by Junio: add that only up to 7 capture groups are
    supported in the pack.island regexes in
    "Documentation/git-pack-objects.txt" in patch 3/6

  * suggested by Peff: move test script from the t99XX range to the
    t53XX range in patch 5/6

  * suggested by Duy: move field 'tree_depth' from 'struct
    object_entry' to 'struct packing_data' in "pack-objects.h" in new
    patch 6/6

The following changes were suggested in the previous iteration, but
have not been implemented:

  * suggested by Peff: rename get_delta_base() in patch 1/6

    I am not sure which name to use, especially as there are a number
    of other functions static to "packfile.c" with a name that starts
    with "get_delta_base" and they should probably be renamed too.

  * suggested by Duy: move field 'layer' from 'struct object_entry' to
    'struct packing_data' in "pack-objects.h"

    I will respond in the original email about this.

  * suggested by Peff: use FLEX_ALLOC_MEM() in island_bitmap_new() in
    patch 2/6

    In his email Peff says that'd waste 4 bytes per struct, so it's
    not worth it in my opinion.

This patch series is also available on GitHub in:

  https://github.com/chriscool/git/commits/delta-islands

The previous version is available there:

  https://github.com/chriscool/git/commits/delta-islands6
  https://public-inbox.org/git/20180722054836.28935-1-chriscool@xxxxxxxxxxxxx/

Christian Couder (1):
  pack-objects: move tree_depth into 'struct packing_data'

Jeff King (5):
  packfile: make get_delta_base() non static
  Add delta-islands.{c,h}
  pack-objects: add delta-islands support
  repack: add delta-islands support
  t: add t5319-delta-islands.sh

 Documentation/config.txt           |  19 ++
 Documentation/git-pack-objects.txt |  97 ++++++
 Documentation/git-repack.txt       |   5 +
 Makefile                           |   1 +
 builtin/pack-objects.c             | 142 ++++++---
 builtin/repack.c                   |   9 +
 delta-islands.c                    | 496 +++++++++++++++++++++++++++++
 delta-islands.h                    |  11 +
 pack-objects.h                     |   6 +
 packfile.c                         |  10 +-
 packfile.h                         |   7 +
 t/t5319-delta-islands.sh           | 143 +++++++++
 12 files changed, 900 insertions(+), 46 deletions(-)
 create mode 100644 delta-islands.c
 create mode 100644 delta-islands.h
 create mode 100755 t/t5319-delta-islands.sh

-- 
2.18.0.327.ga7d188ab43