Re: [PATCH 1/6] fetch: speed up lookup of want refs via commit-graph

On 8/20/2021 6:08 AM, Patrick Steinhardt wrote:
> When updating our local refs based on the refs fetched from the remote,
> we need to iterate through all requested refs and load their respective
> commits such that we can determine whether they need to be appended to
> FETCH_HEAD or not. In cases where we're fetching from a remote with
> exceedingly many refs, resolving these refs can be quite expensive given
> that we repeatedly need to unpack object headers for each of the
> referenced objects.
> 
> Speed this up by opportunistically trying to resolve object IDs via the
> commit graph: more likely than not, they're going to be a commit anyway,
> and this lets us avoid having to unpack object headers completely in
> case the object is a commit that is part of the commit-graph. This
> significantly speeds up mirror-fetches in a real-world repository with
> 2.3M refs:
> 
>     Benchmark #1: HEAD~: git-fetch
>       Time (mean ± σ):     56.942 s ±  0.449 s    [User: 53.360 s, System: 5.356 s]
>       Range (min … max):   56.372 s … 57.533 s    5 runs
> 
>     Benchmark #2: HEAD: git-fetch
>       Time (mean ± σ):     33.657 s ±  0.167 s    [User: 30.302 s, System: 5.181 s]
>       Range (min … max):   33.454 s … 33.844 s    5 runs
> 
>     Summary
>       'HEAD: git-fetch' ran
>         1.69 ± 0.02 times faster than 'HEAD~: git-fetch'

These numbers are impressive, and it makes sense that performing a
binary search on the OID lookup chunk of the commit-graph is faster
than doing a binary search on the OIDs across the pack-index(es).
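
To make the comparison concrete, here is a minimal sketch of a binary
search over one sorted table of fixed-width OIDs. It is an illustration
only, not git's implementation: the name oid_table_find() and the 20-byte
OID width are assumptions, and the real commit-graph also narrows the
range with a fanout chunk first, which is omitted here.

```c
#include <stddef.h>
#include <string.h>

#define OID_RAWSZ 20 /* assumed SHA-1 width, for this sketch only */

/*
 * Hypothetical helper: binary search over `nr` fixed-width OIDs stored
 * back to back in ascending order. Returns the position of `oid`, or -1
 * if it is not in the table.
 */
static long oid_table_find(const unsigned char *table, size_t nr,
			   const unsigned char *oid)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = memcmp(oid, table + mid * OID_RAWSZ, OID_RAWSZ);

		if (!cmp)
			return (long)mid;
		if (cmp < 0)
			hi = mid;	/* target sorts before the midpoint */
		else
			lo = mid + 1;	/* target sorts after the midpoint */
	}
	return -1;
}
```

With a single table there is one search per OID. Without a
multi-pack-index, the same lookup may have to be repeated against the
index of each pack until one of them contains the object.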

I do worry about the case where annotated tags greatly outnumber
branches. The OID of a tag object is never in the commit-graph, so for
each such ref the binary search is pure overhead before we fall back to
the regular lookup, and performance may degrade. Would it be worth
checking whether the ref lies within "refs/heads/" (or even just _not_
within "refs/tags/") before doing this commit-graph check?
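
A minimal sketch of such a check, using only the ref name. The helper
names has_prefix() and want_graph_lookup() are hypothetical; inside git
the prefix test would more likely use the existing starts_with() helper.

```c
#include <string.h>

/* Local stand-in for a prefix test, so the sketch compiles on its own. */
static int has_prefix(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/*
 * Hypothetical helper: returns 1 when the commit-graph lookup looks worth
 * trying for this ref, 0 when the ref is under "refs/tags/" and may well
 * point at an annotated tag object instead of a commit.
 */
static int want_graph_lookup(const char *refname)
{
	return !has_prefix(refname, "refs/tags/");
}
```

This is the looser of the two variants (anything not under "refs/tags/").
Note that lightweight tags point directly at commits, so skipping the
commit-graph for everything under "refs/tags/" gives up the fast path for
those refs; they would still be resolved by the fallback lookup.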

> -			commit = lookup_commit_reference_gently(the_repository,
> -								&rm->old_oid,
> -								1);
> -			if (!commit)
> -				rm->fetch_head_status = FETCH_HEAD_NOT_FOR_MERGE;
> +			commit = lookup_commit_in_graph(the_repository, &rm->old_oid);
> +			if (!commit) {
> +				commit = lookup_commit_reference_gently(the_repository,
> +									&rm->old_oid,
> +									1);
> +				if (!commit)
> +					rm->fetch_head_status = FETCH_HEAD_NOT_FOR_MERGE;

nit: I wouldn't nest this last "if (!commit)".
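
Something along these lines, as a self-contained sketch. The struct
definitions, enum values, and the two *_stub() lookups are simplified
stand-ins for illustration, not git's real types or functions. The quoted
hunk is cut off, so this assumes nothing else needs to live inside the
braces.

```c
#include <stddef.h>

/* Simplified stand-ins; not git's real definitions. */
struct commit { int id; };
enum fetch_head_status { FETCH_HEAD_MERGE, FETCH_HEAD_NOT_FOR_MERGE };
struct ref_sketch {
	int oid;
	enum fetch_head_status fetch_head_status;
};

static struct commit graph_commit = { 1 };
static struct commit odb_commit = { 2 };

/* Stub for the commit-graph lookup: only knows about oid 1. */
static struct commit *lookup_in_graph_stub(int oid)
{
	return oid == 1 ? &graph_commit : NULL;
}

/* Stub for the slower fallback lookup: only knows about oid 2. */
static struct commit *lookup_gently_stub(int oid)
{
	return oid == 2 ? &odb_commit : NULL;
}

/* Flat version: the second check covers both lookups having failed. */
static struct commit *resolve_want(struct ref_sketch *rm)
{
	struct commit *commit = lookup_in_graph_stub(rm->oid);

	if (!commit)
		commit = lookup_gently_stub(rm->oid);
	if (!commit)
		rm->fetch_head_status = FETCH_HEAD_NOT_FOR_MERGE;
	return commit;
}
```

The behavior is the same as the nested form: the status is only changed
when neither lookup finds a commit.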

Code will work as advertised.

Thanks,
-Stolee


