This series introduces the 'git ahead-behind' builtin, which has been used at $DAYJOB for many years, but took many forms before landing on the current version. The main goal of the builtin is to compare multiple references against a common base reference. The comparison is number of commits that are in each side of the symmtric difference of their reachable sets. A commit C is "ahead" of a commit B by the number of commits in B..C (reachable from C but not reachable from B). Similarly, the commit C is "behind" the commit B by the number of commits in C..B (reachable from B but not reachable from C). These numbers can be computed by 'git rev-list --count B..C' and 'git rev-list --count C..B', but there are common needs that benefit from having the checks being done in the same process: 1. Our "branches" page lists ahead/behind counts for each listed branch as compared to the repo's default branch. This can be done with a single 'git ahead-behind' process. 2. When a branch is updated, a background job checks if any pull requests that target that branch should be closed because their branches were merged implicitly by that update. These queries can e batched into 'git ahead-behind' calls. In that second example, we don't need the full ahead/behind counts (although it is sufficient to look for branches that are "zero commits ahead", meaning they are reachable from the base), so this builtin has an extra '--contains' mode that only checks reachability from the base to each of the tips. 'git ahead-behind --contains' is sort of the reverse of 'git branch --contains'. The series starts with some basic boilerplate and argument parsing, along with error conditions for missing objects. To avoid TOCTOU races, an '--ignore-missing' option allows being flexible when a tip reference does not exist. This is all covered in patches 1-3. Patches 4-6 introduce a new method: ensure_generations_valid(). Patch 4 does some refactoring of the existing generation number computations to make it more generic, and patch 5 updates the definition of commit_graph_generation() slightly, making way for patch 6 to implement the method. With an existing commit-graph file, the commits that are not present in the file are considered as having generation number "infinity". This is useful for most of our reachability queries to this point, since those commits are "above" the ones tracked by the commit-graph. When these commits are low in number, then there is very little performance cost and zero correctness cost. However, we will see that the ahead/behind computation requires accurate generation numbers to avoid overcounting. Thus, ensure_generations_valid() is a way to specify a list of commits that need generation numbers computed before continuing. It's a no-op if all of those commits are in the commit-graph file. It's expensive if the commit-graph doesn't exist. However, 'git ahead-behind' computations are likely to be slow no matter what without a commit-graph, so assuming an existing commit-graph file is reasonable. If we find sufficient desire to have an implementation that does not have this requirement, we could create a second implementation and toggle to it when generation_numbers_enabled() returns false. Patch 7 implements the ahead-behind algorithm, as well as integrating the builtin with that implementation. It's a long commit message, so hopefully it explains the algorithm sufficiently. Patch 8 implements the --contains option, which is another algorithm, but more similar to other depth-first searches that already exist in commit-reach.c. Thanks, -Stolee Derrick Stolee (7): ahead-behind: create empty builtin ahead-behind: parse tip references ahead-behind: implement --ignore-missing option commit-graph: combine generation computations commit-graph: return generation from memory ahead-behind: implement ahead_behind() logic ahead-behind: add --contains mode Taylor Blau (1): commit-graph: introduce `ensure_generations_valid()` .gitignore | 1 + Documentation/git-ahead-behind.txt | 78 +++++++++++ Makefile | 1 + builtin.h | 1 + builtin/ahead-behind.c | 121 +++++++++++++++++ commit-graph.c | 209 +++++++++++++++++++---------- commit-graph.h | 7 + commit-reach.c | 205 ++++++++++++++++++++++++++++ commit-reach.h | 37 +++++ git.c | 1 + t/perf/p1500-graph-walks.sh | 29 ++++ t/t4218-ahead-behind.sh | 162 ++++++++++++++++++++++ t/t5318-commit-graph.sh | 2 +- t/t6600-test-reach.sh | 120 +++++++++++++++++ 14 files changed, 904 insertions(+), 70 deletions(-) create mode 100644 Documentation/git-ahead-behind.txt create mode 100644 builtin/ahead-behind.c create mode 100755 t/perf/p1500-graph-walks.sh create mode 100755 t/t4218-ahead-behind.sh base-commit: d15644fe0226af7ffc874572d968598564a230dd Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1489%2Fderrickstolee%2Fstolee%2Fupstream-ahead-behind-v1 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1489/derrickstolee/stolee/upstream-ahead-behind-v1 Pull-Request: https://github.com/gitgitgadget/git/pull/1489 -- gitgitgadget