Re: [PATCH 1/3] commit-graph: examine changed-path objects in pack order

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/22/2019 4:32 AM, Jeff King wrote:
> Looking at the diff of commit objects in pack order is much faster than
> in sha1 order, as it gives locality to the access of tree deltas
> (whereas sha1 order is effectively random). Unfortunately the
> commit-graph code sorts the commits (several times, sometimes as an oid
> and sometimes a pointer-to-commit), and we ultimately traverse in sha1
> order.
> 
> Instead, let's remember the position at which we see each commit, and
> traverse in that order when looking at bloom filters. This drops my time
> for "git commit-graph write --changed-paths" in linux.git from ~4
> minutes to ~1.5 minutes.

I'm doing my own perf tests on these patches, and my copy of linux.git
has four packs of varying sizes (corresponding with my rare fetches and
lack of repacks). My time goes from 3m50s to 3m00s. I was confused at
first, but then realized that I used the "--reachable" flag. In that
case, we never run set_commit_pos(), so all positions are equal and the
sort is not helpful.

I thought that inserting some set_commit_pos() calls into close_reachable()
and add_missing_parents() would give some amount of time-order to the
commits as we compute the filters. However, the time did not change at
all.

I've included the patch below for reference, anyway.

Thanks,
-Stolee

-->8--

>From e7c63d8db09be81ce213ba7f112bb3d2f537bf4a Mon Sep 17 00:00:00 2001
From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
Date: Fri, 27 Dec 2019 09:47:49 -0500
Subject: [PATCH] commit-graph: set commit positions for --reachable

When running 'git commit-graph write --changed-paths', we sort the
commits by pack-order to save time when computing the changed-paths
bloom filters. This does not help when finding the commits via the
--reachable flag.

Add calls to set_commit_pos() when walking the reachable commits,
which provides an ordering similar to a topological ordering.

Unfortunately, the performance did not improve with this change.

Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
---
 commit-graph.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/commit-graph.c b/commit-graph.c
index bf6c663772..a6c4ab401e 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1126,6 +1126,8 @@ static void add_missing_parents(struct write_commit_graph_context *ctx, struct c
 			oidcpy(&ctx->oids.list[ctx->oids.nr], &(parent->item->object.oid));
 			ctx->oids.nr++;
 			parent->item->object.flags |= REACHABLE;
+
+			set_commit_pos(ctx->r, &parent->item->object.oid);
 		}
 	}
 }
@@ -1142,8 +1144,10 @@ static void close_reachable(struct write_commit_graph_context *ctx)
 	for (i = 0; i < ctx->oids.nr; i++) {
 		display_progress(ctx->progress, i + 1);
 		commit = lookup_commit(ctx->r, &ctx->oids.list[i]);
-		if (commit)
+		if (commit) {
 			commit->object.flags |= REACHABLE;
+			set_commit_pos(ctx->r, &commit->object.oid);
+		}
 	}
 	stop_progress(&ctx->progress);
 
-- 
2.25.0.rc0




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux