On Thu, Nov 1, 2018 at 12:02 PM Derrick Stolee <stolee@xxxxxxxxx> wrote:
>
> On 11/1/2018 2:57 PM, Elijah Newren wrote:
> > On Thu, Nov 1, 2018 at 5:32 AM Derrick Stolee <stolee@xxxxxxxxx> wrote:
> >> No rush. I'd just like to understand how removing the commit-graph file
> >> can make the new algorithm faster. Putting a similar count in the old
> >> algorithm would involve giving a count for every call to
> >> in_merge_bases_many(), which would be very noisy.
> > $ time git push --dry-run --follow-tags /home/newren/repo-mirror
> > count: 92912
> > To /home/newren/repo-mirror
> >  * [new branch]       test5 -> test5
> >
> > real    0m3.024s
> > user    0m2.752s
> > sys     0m0.320s
>
> Is the above test with or without the commit-graph file? Can you run it
> in the other mode, too? I'd like to see if the "count" value changes
> when the only difference is the presence of a commit-graph file.

I apologize for providing misleading information earlier; this was an
apples to oranges comparison. Here's what I did:

<build a version of git with your fixes>
git clone coworker.bundle coworker-repo
cd coworker-repo
time git push --dry-run --follow-tags /home/newren/repo-mirror
git config core.commitgraph true
git config gc.writecommitgraph true
git gc
time git push --dry-run --follow-tags /home/newren/nucleus-mirror

I figured I had just done a fresh clone, so surely the gc wouldn't do
anything other than write the .git/objects/info/commit-graph file.
However, the original bundle contained many references outside of
refs/heads/ and refs/tags/:

$ git bundle list-heads ../coworker.bundle | grep -v -e refs/heads/ -e refs/tags/ -e HEAD | wc -l
2396

These other refs apparently referred to objects not otherwise
referenced in refs/heads/ and refs/tags/, and caused the gc to explode
lots of loose objects:

$ git count-objects -v
count: 147604
size: 856416
in-pack: 1180692
packs: 1
size-pack: 796143
prune-packable: 0
garbage: 0
size-garbage: 0

The slowdown with commit-graph was entirely due to there being lots of
loose objects (147K of them). If I add a git-prune before doing the
timing with commit-graph, then the timing with commit-graph is faster
than the run without a commit-graph.

Sorry for the wild goose chase. And thanks for the fixes;
get_reachable_subset() makes things much faster even without a
commit-graph, and the commit-graph just improves it more. :-)
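
For anyone repeating this kind of comparison, here is a rough sketch of a
guard that checks the loose-object count before timing anything. The
helper names (loose_count, few_loose_objects) and the limit of 1000 are
made up for illustration and are not part of git; the sample input is the
count-objects output quoted above.

```shell
#!/bin/sh
# Hypothetical helper: read `git count-objects -v` output on stdin and
# print the number of loose objects (the "count:" field).
loose_count() {
    awk -F': ' '$1 == "count" { print $2 }'
}

# Hypothetical helper: succeed only if the loose-object count read from
# stdin is at or below the limit given as $1 (the limit is arbitrary).
few_loose_objects() {
    limit=$1
    n=$(loose_count)
    [ -n "$n" ] && [ "$n" -le "$limit" ]
}

# Sample input copied from the output shown in this message.
sample='count: 147604
size: 856416
in-pack: 1180692
packs: 1
size-pack: 796143
prune-packable: 0
garbage: 0
size-garbage: 0'

printf '%s\n' "$sample" | loose_count

if printf '%s\n' "$sample" | few_loose_objects 1000; then
    echo "ok to benchmark"
else
    echo "run git prune first"
fi
```

In a real repository the same check could be fed from git directly, e.g.
`git count-objects -v | few_loose_objects 1000 || git prune`, run after
`git gc` and before the second `time git push ...`, so that both timings
see a comparable object store.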