[PATCH 0/8] ahead-behind: new builtin for counting multiple commit ranges

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This series introduces the 'git ahead-behind' builtin, which has been used
at $DAYJOB for many years, but took many forms before landing on the current
version.

The main goal of the builtin is to compare multiple references against a
common base reference. The comparison is number of commits that are in each
side of the symmtric difference of their reachable sets. A commit C is
"ahead" of a commit B by the number of commits in B..C (reachable from C but
not reachable from B). Similarly, the commit C is "behind" the commit B by
the number of commits in C..B (reachable from B but not reachable from C).

These numbers can be computed by 'git rev-list --count B..C' and 'git
rev-list --count C..B', but there are common needs that benefit from having
the checks being done in the same process:

 1. Our "branches" page lists ahead/behind counts for each listed branch as
    compared to the repo's default branch. This can be done with a single
    'git ahead-behind' process.
 2. When a branch is updated, a background job checks if any pull requests
    that target that branch should be closed because their branches were
    merged implicitly by that update. These queries can e batched into 'git
    ahead-behind' calls.

In that second example, we don't need the full ahead/behind counts (although
it is sufficient to look for branches that are "zero commits ahead", meaning
they are reachable from the base), so this builtin has an extra '--contains'
mode that only checks reachability from the base to each of the tips. 'git
ahead-behind --contains' is sort of the reverse of 'git branch --contains'.

The series starts with some basic boilerplate and argument parsing, along
with error conditions for missing objects. To avoid TOCTOU races, an
'--ignore-missing' option allows being flexible when a tip reference does
not exist. This is all covered in patches 1-3.

Patches 4-6 introduce a new method: ensure_generations_valid(). Patch 4 does
some refactoring of the existing generation number computations to make it
more generic, and patch 5 updates the definition of
commit_graph_generation() slightly, making way for patch 6 to implement the
method. With an existing commit-graph file, the commits that are not present
in the file are considered as having generation number "infinity". This is
useful for most of our reachability queries to this point, since those
commits are "above" the ones tracked by the commit-graph. When these commits
are low in number, then there is very little performance cost and zero
correctness cost.

However, we will see that the ahead/behind computation requires accurate
generation numbers to avoid overcounting. Thus, ensure_generations_valid()
is a way to specify a list of commits that need generation numbers computed
before continuing. It's a no-op if all of those commits are in the
commit-graph file. It's expensive if the commit-graph doesn't exist.
However, 'git ahead-behind' computations are likely to be slow no matter
what without a commit-graph, so assuming an existing commit-graph file is
reasonable. If we find sufficient desire to have an implementation that does
not have this requirement, we could create a second implementation and
toggle to it when generation_numbers_enabled() returns false.

Patch 7 implements the ahead-behind algorithm, as well as integrating the
builtin with that implementation. It's a long commit message, so hopefully
it explains the algorithm sufficiently.

Patch 8 implements the --contains option, which is another algorithm, but
more similar to other depth-first searches that already exist in
commit-reach.c.

Thanks, -Stolee

Derrick Stolee (7):
  ahead-behind: create empty builtin
  ahead-behind: parse tip references
  ahead-behind: implement --ignore-missing option
  commit-graph: combine generation computations
  commit-graph: return generation from memory
  ahead-behind: implement ahead_behind() logic
  ahead-behind: add --contains mode

Taylor Blau (1):
  commit-graph: introduce `ensure_generations_valid()`

 .gitignore                         |   1 +
 Documentation/git-ahead-behind.txt |  78 +++++++++++
 Makefile                           |   1 +
 builtin.h                          |   1 +
 builtin/ahead-behind.c             | 121 +++++++++++++++++
 commit-graph.c                     | 209 +++++++++++++++++++----------
 commit-graph.h                     |   7 +
 commit-reach.c                     | 205 ++++++++++++++++++++++++++++
 commit-reach.h                     |  37 +++++
 git.c                              |   1 +
 t/perf/p1500-graph-walks.sh        |  29 ++++
 t/t4218-ahead-behind.sh            | 162 ++++++++++++++++++++++
 t/t5318-commit-graph.sh            |   2 +-
 t/t6600-test-reach.sh              | 120 +++++++++++++++++
 14 files changed, 904 insertions(+), 70 deletions(-)
 create mode 100644 Documentation/git-ahead-behind.txt
 create mode 100644 builtin/ahead-behind.c
 create mode 100755 t/perf/p1500-graph-walks.sh
 create mode 100755 t/t4218-ahead-behind.sh


base-commit: d15644fe0226af7ffc874572d968598564a230dd
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1489%2Fderrickstolee%2Fstolee%2Fupstream-ahead-behind-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1489/derrickstolee/stolee/upstream-ahead-behind-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1489
-- 
gitgitgadget



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux