On Wed, Nov 28 2018, Derrick Stolee via GitGitGadget wrote:

> One of the biggest remaining pain points for users of very large
> repositories is the time it takes to run 'git push'. We inspected some slow
> pushes by our developers and found that the "Enumerating Objects" phase of a
> push was very slow. This is unsurprising, because this is why reachability
> bitmaps exist. However, reachability bitmaps are not available to us because
> of the single pack-file requirement. The bitmap approach is intended for
> servers anyway, and clients have a much different behavior pattern.
>
> Specifically, clients are normally pushing a very small number of objects
> compared to the entire working directory. A typical user changes only a
> small cone of the working directory, so let's use that to our benefit.
>
> Create a new "sparse" mode for 'git pack-objects' that uses the paths that
> introduce new objects to direct our search into the reachable trees. By
> collecting trees at each path, we can then recurse into a path only when
> there are uninteresting and interesting trees at that path. This gains a
> significant performance boost for small topics while presenting a
> possibility of packing extra objects.
>
> The main algorithm change is in patch 4, but is set up a little bit in
> patches 1 and 2.
>
> As demonstrated in the included test script, we see that the existing
> algorithm can send extra objects due to the way we specify the "frontier".
> But we can send even more objects if a user copies objects from one folder
> to another. I say "copy" because a rename would (usually) change the
> original folder and trigger a walk into that path, discovering the objects.
>
> In order to benefit from this approach, the user can opt-in using the
> pack.useSparse config setting. This setting can be overridden using the
> '--no-sparse' option.

This is really interesting.
I tested this with:

    diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
    index 124b1bafc4..5c7615f06c 100644
    --- a/builtin/pack-objects.c
    +++ b/builtin/pack-objects.c
    @@ -3143 +3143 @@ static void get_object_list(int ac, const char **av)
    -	mark_edges_uninteresting(&revs, show_edge, sparse);
    +	mark_edges_uninteresting(&revs, show_edge, 1);

To emulate having a GIT_TEST_* mode for this, which seems like a good
idea since it turned up a lot of segfaults in pack-objects. I wasn't
able to get a backtrace for that since it always happens indirectly, and
I didn't dig enough to see how to manually invoke it the right way.
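
For illustration only, a less invasive way to get the same effect as the
hack above might be an environment toggle along these lines. This is a
standalone sketch, not code from the series: the variable name
GIT_TEST_PACK_SPARSE and both helpers are hypothetical, and inside git
itself one would presumably call git_env_bool() instead of the
simplified getenv() parsing shown here.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical, simplified stand-in for git's boolean environment
 * parsing: returns def when the variable is unset, 0 for an empty
 * value or "0"/"false"/"no"/"off", and 1 for anything else.
 */
static int env_bool(const char *name, int def)
{
	const char *v = getenv(name);

	if (!v)
		return def;
	if (!*v || !strcmp(v, "0") || !strcmp(v, "false") ||
	    !strcmp(v, "no") || !strcmp(v, "off"))
		return 0;
	return 1;
}

/*
 * Hypothetical helper: compute the value to pass as the last argument
 * of mark_edges_uninteresting(). The test variable can only force the
 * sparse mode on; it never turns off a sparse mode already requested.
 */
static int effective_sparse(int sparse)
{
	return sparse || env_bool("GIT_TEST_PACK_SPARSE", 0);
}
```

With something like that, the call site would pass
effective_sparse(sparse) rather than a hard-coded 1, and the whole test
suite could be run with the variable set to exercise the new code path.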