Re: [PATCH] fetch-pack: speed up loading of refs via commit graph

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 8/4/2021 9:56 AM, Patrick Steinhardt wrote:
> When doing reference negotiation, git-fetch-pack(1) is loading all refs
> from disk in order to determine which commits it has in common with the
> remote repository. This can be quite expensive in repositories with many
> references though: in a real-world repository with around 2.2 million
> refs, fetching a single commit by its ID takes around 44 seconds.
> 
> Dominating the loading time is decompression and parsing of the objects
> which are referenced by commits. Given the fact that we only care about
> commits (or tags which can be peeled to one) in this context, there is
> thus an easy performance win by switching the parsing logic to make use
> of the commit graph in case we have one available.

Nice find!

> Like this, we avoid
> hitting the object database to parse these commits but instead only load
> them from the commit-graph. This results in a significant performance
> boost when executing git-fetch in said repository with 2.2 million refs:
> 
>     Benchmark #1: HEAD~: git fetch $remote $commit
>       Time (mean ± σ):     44.168 s ±  0.341 s    [User: 42.985 s, System: 1.106 s]
>       Range (min … max):   43.565 s … 44.577 s    10 runs
> 
>     Benchmark #2: HEAD: git fetch $remote $commit
>       Time (mean ± σ):     19.498 s ±  0.724 s    [User: 18.751 s, System: 0.690 s]
>       Range (min … max):   18.629 s … 20.454 s    10 runs
> 
>     Summary
>       'HEAD: git fetch $remote $commit' ran
>         2.27 ± 0.09 times faster than 'HEAD~: git fetch $remote $commit'

That's a great improvement. I'm sure that the remaining time is dominated
by ref parsing.

> -	if (type == OBJ_COMMIT)
> -		return (struct commit *) parse_object(the_repository, oid);
> +
> +	if (type == OBJ_COMMIT) {
> +		struct commit *commit = lookup_commit(the_repository, oid);
> +		if (!commit || repo_parse_commit(the_repository, commit))
> +			return NULL;
> +		return commit;
> +	}
> +

And this change looks obviously correct to me. I'm glad that the
implementation actually doesn't care about the commit-graph, but
instead cares about using the "standard" parsing approach instead
of side-stepping the commit-graph via parse_object().

I took a quick look for other instances where we use parse_object()
but also know to expect a commit, but did not find one.

Thanks,
-Stolee



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux