[PATCH 1/2] fetch-pack: use commit-graph when computing cutoff

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



During packfile negotiation we iterate over all refs announced by the
remote side to check whether their IDs refer to commits already known to
us. If a commit is known to us already, then its date is a potential
cutoff point for commits we have in common with the remote side.

There is potentially a lot of commits announced by the remote depending
on how many refs there are in the remote repository, and for every one
of them we need to search for it in our object database and, if found,
parse the corresponding object to find out whether it is a candidate for
the cutoff date. This can be sped up by trying to look up commits via
the commit-graph first, which is a lot more efficient.

One thing to keep in mind though is that the commit-graph corrects
committer dates:

    * A commit with at least one parent has corrected committer date
      equal to the maximum of its commiter date and one more than the
      largest corrected committer date among its parents.

As a result, it may be that the commit date we load via the commit graph
is more recent than it would have been when loaded via the ODB, and as a
result we may also choose a more recent cutoff point. But as the code
documents, this is only a heuristic and it is okay if we determine a
wrong cutoff date. The worst that can happen is that we report more
commits as HAVEs to the server when using corrected dates.

Loading commits via the commit-graph is typically much faster than
loading commits via the object database. Benchmarks in a repository with
about 2,1 million refs and an up-to-date commit-graph show a 20% speedup
when mirror-fetching:

    Benchmark 1: git fetch --atomic +refs/*:refs/* (v2.35.0)
      Time (mean ± σ):     75.264 s ±  1.115 s    [User: 68.199 s, System: 10.094 s]
      Range (min … max):   74.145 s … 76.862 s    5 runs

    Benchmark 2: git fetch --atomic +refs/*:refs/* (HEAD)
      Time (mean ± σ):     62.350 s ±  0.854 s    [User: 55.412 s, System: 9.976 s]
      Range (min … max):   61.224 s … 63.216 s    5 runs

    Summary
      'git fetch --atomic +refs/*:refs/* (HEAD)' ran
        1.21 ± 0.02 times faster than 'git fetch --atomic +refs/*:refs/* (v2.35.0)'

Signed-off-by: Patrick Steinhardt <ps@xxxxxx>
---
 fetch-pack.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index dd6ec449f2..c5967e228e 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -696,26 +696,30 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 
 	trace2_region_enter("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);
 	for (ref = *refs; ref; ref = ref->next) {
-		struct object *o;
+		struct commit *commit;
 
-		if (!has_object_file_with_flags(&ref->old_oid,
+		commit = lookup_commit_in_graph(the_repository, &ref->old_oid);
+		if (!commit) {
+			struct object *o;
+
+			if (!has_object_file_with_flags(&ref->old_oid,
 						OBJECT_INFO_QUICK |
-							OBJECT_INFO_SKIP_FETCH_OBJECT))
-			continue;
-		o = parse_object(the_repository, &ref->old_oid);
-		if (!o)
-			continue;
+						OBJECT_INFO_SKIP_FETCH_OBJECT))
+				continue;
+			o = parse_object(the_repository, &ref->old_oid);
+			if (!o || o->type != OBJ_COMMIT)
+				continue;
+
+			commit = (struct commit *)o;
+		}
 
 		/*
 		 * We already have it -- which may mean that we were
 		 * in sync with the other side at some time after
 		 * that (it is OK if we guess wrong here).
 		 */
-		if (o->type == OBJ_COMMIT) {
-			struct commit *commit = (struct commit *)o;
-			if (!cutoff || cutoff < commit->date)
-				cutoff = commit->date;
-		}
+		if (!cutoff || cutoff < commit->date)
+			cutoff = commit->date;
 	}
 	trace2_region_leave("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);
 
-- 
2.35.0

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux