On 8/4/2021 9:56 AM, Patrick Steinhardt wrote: > When doing reference negotiation, git-fetch-pack(1) is loading all refs > from disk in order to determine which commits it has in common with the > remote repository. This can be quite expensive in repositories with many > references though: in a real-world repository with around 2.2 million > refs, fetching a single commit by its ID takes around 44 seconds. > > Dominating the loading time is decompression and parsing of the objects > which are referenced by commits. Given the fact that we only care about > commits (or tags which can be peeled to one) in this context, there is > thus an easy performance win by switching the parsing logic to make use > of the commit graph in case we have one available. Nice find! > Like this, we avoid > hitting the object database to parse these commits but instead only load > them from the commit-graph. This results in a significant performance > boost when executing git-fetch in said repository with 2.2 million refs: > > Benchmark #1: HEAD~: git fetch $remote $commit > Time (mean ± σ): 44.168 s ± 0.341 s [User: 42.985 s, System: 1.106 s] > Range (min … max): 43.565 s … 44.577 s 10 runs > > Benchmark #2: HEAD: git fetch $remote $commit > Time (mean ± σ): 19.498 s ± 0.724 s [User: 18.751 s, System: 0.690 s] > Range (min … max): 18.629 s … 20.454 s 10 runs > > Summary > 'HEAD: git fetch $remote $commit' ran > 2.27 ± 0.09 times faster than 'HEAD~: git fetch $remote $commit' That's a great improvement. I'm sure that the remaining time is dominated by ref parsing. > - if (type == OBJ_COMMIT) > - return (struct commit *) parse_object(the_repository, oid); > + > + if (type == OBJ_COMMIT) { > + struct commit *commit = lookup_commit(the_repository, oid); > + if (!commit || repo_parse_commit(the_repository, commit)) > + return NULL; > + return commit; > + } > + And this change looks obviously correct to me. I'm glad that the implementation actually doesn't care about the commit-graph, but instead cares about using the "standard" parsing approach instead of side-stepping the commit-graph via parse_object(). I took a quick look for other instances where we use parse_object() but also know to expect a commit, but did not find one. Thanks, -Stolee