On 12/22/2019 4:32 AM, Jeff King wrote: > Looking at the diff of commit objects in pack order is much faster than > in sha1 order, as it gives locality to the access of tree deltas > (whereas sha1 order is effectively random). Unfortunately the > commit-graph code sorts the commits (several times, sometimes as an oid > and sometimes a pointer-to-commit), and we ultimately traverse in sha1 > order. > > Instead, let's remember the position at which we see each commit, and > traverse in that order when looking at bloom filters. This drops my time > for "git commit-graph write --changed-paths" in linux.git from ~4 > minutes to ~1.5 minutes. I'm doing my own perf tests on these patches, and my copy of linux.git has four packs of varying sizes (corresponding with my rare fetches and lack of repacks). My time goes from 3m50s to 3m00s. I was confused at first, but then realized that I used the "--reachable" flag. In that case, we never run set_commit_pos(), so all positions are equal and the sort is not helpful. I thought that inserting some set_commit_pos() calls into close_reachable() and add_missing_parents() would give some amount of time-order to the commits as we compute the filters. However, the time did not change at all. I've included the patch below for reference, anyway. Thanks, -Stolee -->8-- >From e7c63d8db09be81ce213ba7f112bb3d2f537bf4a Mon Sep 17 00:00:00 2001 From: Derrick Stolee <dstolee@xxxxxxxxxxxxx> Date: Fri, 27 Dec 2019 09:47:49 -0500 Subject: [PATCH] commit-graph: set commit positions for --reachable When running 'git commit-graph write --changed-paths', we sort the commits by pack-order to save time when computing the changed-paths bloom filters. This does not help when finding the commits via the --reachable flag. Add calls to set_commit_pos() when walking the reachable commits, which provides an ordering similar to a topological ordering. Unfortunately, the performance did not improve with this change. Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx> --- commit-graph.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/commit-graph.c b/commit-graph.c index bf6c663772..a6c4ab401e 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -1126,6 +1126,8 @@ static void add_missing_parents(struct write_commit_graph_context *ctx, struct c oidcpy(&ctx->oids.list[ctx->oids.nr], &(parent->item->object.oid)); ctx->oids.nr++; parent->item->object.flags |= REACHABLE; + + set_commit_pos(ctx->r, &parent->item->object.oid); } } } @@ -1142,8 +1144,10 @@ static void close_reachable(struct write_commit_graph_context *ctx) for (i = 0; i < ctx->oids.nr; i++) { display_progress(ctx->progress, i + 1); commit = lookup_commit(ctx->r, &ctx->oids.list[i]); - if (commit) + if (commit) { commit->object.flags |= REACHABLE; + set_commit_pos(ctx->r, &commit->object.oid); + } } stop_progress(&ctx->progress); -- 2.25.0.rc0