This patch series performs a decently-sized refactoring of the revision-walk machinery. Well, "refactoring" is probably the wrong word, as I don't actually remove the old code. Instead, when we see certain options in the 'rev_info' struct, we redirect the commit-walk logic to a new set of methods that distribute the workload differently. By using generation numbers in the commit-graph, we can significantly improve 'git log --graph' commands (and the underlying 'git rev-list --topo-order'). On the Linux repository, I got the following performance results when comparing to the previous version with or without a commit-graph: Test: git rev-list --topo-order -100 HEAD HEAD~1, no commit-graph: 6.80 s HEAD~1, w/ commit-graph: 0.77 s HEAD, w/ commit-graph: 0.02 s Test: git rev-list --topo-order -100 HEAD -- tools HEAD~1, no commit-graph: 9.63 s HEAD~1, w/ commit-graph: 6.06 s HEAD, w/ commit-graph: 0.06 s If you want to read this series but are unfamiliar with the commit-graph and generation numbers, then I recommend reading `Documentation/technical/commit-graph.txt` or a blog post [1] I wrote on the subject. In particular, the three-part walk described in "revision.c: refactor basic topo-order logic" is present (but underexplained) as an animated PNG [2]. **UPDATED** Now that we have had some review and some dogfooding, I'm removing the paragraph I had here about "RFC quality". I think this is ready to merge! One notable case that is not included in this series is the case of a history comparison such as 'git rev-list --topo-order A..B'. The existing code in limit_list() has ways to cut the walk short when all pending commits are UNINTERESTING. Since this code depends on commit_list instead of the prio_queue we are using here, I chose to leave it untouched for now. We can revisit it in a separate series later. Since handle_commit() turns on revs->limited when a commit is UNINTERESTING, we do not hit the new code in this case. Removing this 'revs->limited = 1;' line yields correct results, but the performance can be worse. **UPDATED** See the discussion about Generation Number V2 [4] for more on this topic. Changes in V5: Thanks Jakub for feedback on the huge commit! I think I've responded to all the code feedback. See the range-diff at the end of this cover-page. Thanks, -Stolee [1] https://blogs.msdn.microsoft.com/devops/2018/07/09/supercharging-the-git-commit-graph-iii-generations/ Supercharging the Git Commit Graph III: Generations and Graph Algorithms [2] https://msdnshared.blob.core.windows.net/media/2018/06/commit-graph-topo-order-b-a.png Animation showing three-part walk [3] https://github.com/derrickstolee/git/tree/topo-order/test A branch containing this series along with commits to compute commit-graph in entire test suite. [4] https://public-inbox.org/git/6367e30a-1b3a-4fe9-611b-d931f51effef@xxxxxxxxx/ [RFC] Generation Number v2 Note: I'm not submitting this version via GitGitGadget because it's currently struggling with how to handle a PR in a conflict state. The new flags in revision.h have a conflict with recent changes in master. Derrick Stolee (7): prio-queue: add 'peek' operation test-reach: add run_three_modes method test-reach: add rev-list tests revision.c: begin refactoring --topo-order logic commit/revisions: bookkeeping before refactoring revision.c: generation-based topo-order algorithm t6012: make rev-list tests more interesting commit.c | 9 +- commit.h | 7 + object.h | 4 +- prio-queue.c | 9 ++ prio-queue.h | 6 + revision.c | 243 +++++++++++++++++++++++++++++++++-- revision.h | 6 + t/helper/test-prio-queue.c | 26 ++-- t/t0009-prio-queue.sh | 14 ++ t/t6012-rev-list-simplify.sh | 45 +++++-- t/t6600-test-reach.sh | 96 +++++++++++++- 11 files changed, 426 insertions(+), 39 deletions(-) base-commit: 2d3b1c576c85b7f5db1f418907af00ab88e0c303 -- 2.19.1.542.gc4df23f792 -->8-- 1: 2358cfd5ed = 1: 7c75a56505 prio-queue: add 'peek' operation 2: 3a4b68e479 = 2: 686c4370de test-reach: add run_three_modes method 3: 12a3f6d367 = 3: 7410c00982 test-reach: add rev-list tests 4: cd9eef9688 = 4: 5439e11e37 revision.c: begin refactoring --topo-order logic 5: f3e291665d ! 5: 71554deb9b commit/revisions: bookkeeping before refactoring @@ -9,8 +9,8 @@ compare_commits_by_author_date() in revision.c. These are used currently by sort_in_topological_order() in commit.c. - 2. Moving these methods to commit.h requires adding the author_slab - definition to commit.h. + 2. Moving these methods to commit.h requires adding an author_date_slab + declaration to commit.h. Consumers will need their own implementation. 3. The add_parents_to_list() method in revision.c performs logic around the UNINTERESTING flag and other special cases depending @@ -31,8 +31,7 @@ define_commit_slab(indegree_slab, int); -/* record author-date for each commit object */ --define_commit_slab(author_date_slab, timestamp_t); -+implement_shared_commit_slab(author_date_slab, timestamp_t); + define_commit_slab(author_date_slab, timestamp_t); -static void record_author_date(struct author_date_slab *author_date, - struct commit *commit) @@ -69,8 +68,7 @@ extern int check_commit_signature(const struct commit *commit, struct signature_check *sigc); +/* record author-date for each commit object */ -+define_shared_commit_slab(author_date_slab, timestamp_t); -+ ++struct author_date_slab; +void record_author_date(struct author_date_slab *author_date, + struct commit *commit); + 6: aa0bb2221d ! 6: 84c142e0bc revision.c: generation-based topo-order algorithm @@ -195,6 +195,7 @@ -struct topo_walk_info {}; +define_commit_slab(indegree_slab, int); ++define_commit_slab(author_date_slab, timestamp_t); + +struct topo_walk_info { + uint32_t min_generation; @@ -243,12 +244,12 @@ +} + +static void explore_to_depth(struct rev_info *revs, -+ uint32_t gen) ++ uint32_t gen_cutoff) +{ + struct topo_walk_info *info = revs->topo_walk_info; + struct commit *c; + while ((c = prio_queue_peek(&info->explore_queue)) && -+ c->generation >= gen) ++ c->generation >= gen_cutoff) + explore_walk_step(revs); +} + @@ -266,9 +267,6 @@ + + explore_to_depth(revs, c->generation); + -+ if (parse_commit_gently(c, 1) < 0) -+ return; -+ + for (p = c->parents; p; p = p->next) { + struct commit *parent = p->item; + int *pi = indegree_slab_at(&info->indegree, parent); @@ -285,12 +283,13 @@ + } +} + -+static void compute_indegrees_to_depth(struct rev_info *revs) ++static void compute_indegrees_to_depth(struct rev_info *revs, ++ uint32_t gen_cutoff) +{ + struct topo_walk_info *info = revs->topo_walk_info; + struct commit *c; + while ((c = prio_queue_peek(&info->indegree_queue)) && -+ c->generation >= info->min_generation) ++ c->generation >= gen_cutoff) + indegree_walk_step(revs); +} @@ -305,9 +304,9 @@ - limit_list(revs); - sort_in_topological_order(&revs->commits, revs->sort_order); + init_indegree_slab(&info->indegree); -+ memset(&info->explore_queue, '\0', sizeof(info->explore_queue)); -+ memset(&info->indegree_queue, '\0', sizeof(info->indegree_queue)); -+ memset(&info->topo_queue, '\0', sizeof(info->topo_queue)); ++ memset(&info->explore_queue, 0, sizeof(info->explore_queue)); ++ memset(&info->indegree_queue, 0, sizeof(info->indegree_queue)); ++ memset(&info->topo_queue, 0, sizeof(info->topo_queue)); + + switch (revs->sort_order) { + default: /* REV_SORT_IN_GRAPH_ORDER */ @@ -329,23 +328,22 @@ + info->min_generation = GENERATION_NUMBER_INFINITY; + for (list = revs->commits; list; list = list->next) { + struct commit *c = list->item; -+ test_flag_and_insert(&info->explore_queue, c, TOPO_WALK_EXPLORED); -+ test_flag_and_insert(&info->indegree_queue, c, TOPO_WALK_INDEGREE); + + if (parse_commit_gently(c, 1)) + continue; ++ ++ test_flag_and_insert(&info->explore_queue, c, TOPO_WALK_EXPLORED); ++ test_flag_and_insert(&info->indegree_queue, c, TOPO_WALK_INDEGREE); ++ + if (c->generation < info->min_generation) + info->min_generation = c->generation; -+ } + -+ for (list = revs->commits; list; list = list->next) { -+ struct commit *c = list->item; + *(indegree_slab_at(&info->indegree, c)) = 1; + + if (revs->sort_order == REV_SORT_BY_AUTHOR_DATE) + record_author_date(&info->author_date, c); + } -+ compute_indegrees_to_depth(revs); ++ compute_indegrees_to_depth(revs, info->min_generation); + + for (list = revs->commits; list; list = list->next) { + struct commit *c = list->item; @@ -385,9 +383,8 @@ + if (process_parents(revs, commit, NULL, NULL) < 0) { if (!revs->ignore_missing_links) die("Failed to traverse parents of commit %s", -- oid_to_hex(&commit->object.oid)); -+ oid_to_hex(&commit->object.oid)); -+ } + oid_to_hex(&commit->object.oid)); + } + + for (p = commit->parents; p; p = p->next) { + struct commit *parent = p->item; @@ -398,7 +395,7 @@ + + if (parent->generation < info->min_generation) { + info->min_generation = parent->generation; -+ compute_indegrees_to_depth(revs); ++ compute_indegrees_to_depth(revs, info->min_generation); + } + + pi = indegree_slab_at(&info->indegree, parent); @@ -409,9 +406,10 @@ + + if (revs->first_parent_only) + return; - } ++ } } + int prepare_revision_walk(struct rev_info *revs) diff --git a/revision.h b/revision.h --- a/revision.h 7: a21febe112 = 7: 5479087812 t6012: make rev-list tests more interesting