Hi Derrick ,
On 03/06/2019 17:03, Derrick Stolee via GitGitGadget wrote:
From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
Add a basic description of commit-graph chains.
Not really your problem, but I did notice that we don't actually explain
what we mean here by a commit graph (before we start chaining them), and
the distinction between the generic concept and the specific implementation.
If I understand it correctly, the regular DAG (directed acyclic graph)
already inherently contains the commit graph, showing the parent(s) of
each commit. Hence, why do we need another? (which then needs explaining
the what/why/how)
So, in one sense, another commit chain is potentially duplicated
redundant data. What hasn't been surfaced (for the reader coming later)
is probably that accessing the DAG commit graph can be (a) slow, (b) one
way (no child relationships), and (c) accesses large amounts of other
data that isn't relevant to the task at hand.
So the commit graph (implementation) is [I think] a fast, compact,
sorted(?), list of commit oids that provides two way linkage through the
commit graph (?) to allow fast queries within the Git codebase.
The commit graph is normally considered immutable, however the DAG
commit graph can be extended by new commits, trimmed by branch deletion,
rebasing, forced push, etc, or even reorganised via 'replace' or grafts
commits, which must then be reflected in the commit graph (implementation).
It just felt that there is a gap between the high level DAG, explained
in the glossary, and the commit-graph That perhaps the
technical/commit-graph.txt ought to summarise.
--
Philip
More details about the
feature will be added as we add functionality. This introduction gives a
high-level overview to the goals of the feature and the basic layout of
commit-graph chains.
Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
---
Documentation/technical/commit-graph.txt | 59 ++++++++++++++++++++++++
1 file changed, 59 insertions(+)
diff --git a/Documentation/technical/commit-graph.txt b/Documentation/technical/commit-graph.txt
index fb53341d5e..1dca3bd8fe 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -127,6 +127,65 @@ Design Details
helpful for these clones, anyway. The commit-graph will not be read or
written when shallow commits are present.
+Commit Graphs Chains
+--------------------
+
+Typically, repos grow with near-constant velocity (commits per day). Over time,
+the number of commits added by a fetch operation is much smaller than the
+number of commits in the full history. By creating a "chain" of commit-graphs,
+we enable fast writes of new commit data without rewriting the entire commit
+history -- at least, most of the time.
+
+## File Layout
+
+A commit-graph chain uses multiple files, and we use a fixed naming convention
+to organize these files. Each commit-graph file has a name
+`$OBJDIR/info/commit-graphs/graph-{hash}.graph` where `{hash}` is the hex-
+valued hash stored in the footer of that file (which is a hash of the file's
+contents before that hash). For a chain of commit-graph files, a plain-text
+file at `$OBJDIR/info/commit-graphs/commit-graph-chain` contains the
+hashes for the files in order from "lowest" to "highest".
+
+For example, if the `commit-graph-chain` file contains the lines
+
+```
+ {hash0}
+ {hash1}
+ {hash2}
+```
+
+then the commit-graph chain looks like the following diagram:
+
+ +-----------------------+
+ | graph-{hash2}.graph |
+ +-----------------------+
+ |
+ +-----------------------+
+ | |
+ | graph-{hash1}.graph |
+ | |
+ +-----------------------+
+ |
+ +-----------------------+
+ | |
+ | |
+ | |
+ | graph-{hash0}.graph |
+ | |
+ | |
+ | |
+ +-----------------------+
+
+Let X0 be the number of commits in `graph-{hash0}.graph`, X1 be the number of
+commits in `graph-{hash1}.graph`, and X2 be the number of commits in
+`graph-{hash2}.graph`. If a commit appears in position i in `graph-{hash2}.graph`,
+then we interpret this as being the commit in position (X0 + X1 + i), and that
+will be used as its "graph position". The commits in `graph-{hash2}.graph` use these
+positions to refer to their parents, which may be in `graph-{hash1}.graph` or
+`graph-{hash0}.graph`. We can navigate to an arbitrary commit in position j by checking
+its containment in the intervals [0, X0), [X0, X0 + X1), [X0 + X1, X0 + X1 +
+X2).
+
Related Links
-------------
[0] https://bugs.chromium.org/p/git/issues/detail?id=8