Re: [PATCH v3 01/14] commit-graph: document commit-graph chains

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Derrick ,

On 03/06/2019 17:03, Derrick Stolee via GitGitGadget wrote:
From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>

Add a basic description of commit-graph chains.
Not really your problem, but I did notice that we don't actually explain what we mean here by a commit graph (before we start chaining them), and the distinction between the generic concept and the specific implementation.

If I understand it correctly, the regular DAG (directed acyclic graph) already inherently contains the commit graph, showing the parent(s) of each commit. Hence, why do we need another? (which then needs explaining the what/why/how)

So, in one sense, another commit chain is potentially duplicated redundant data. What hasn't been surfaced (for the reader coming later) is probably that accessing the DAG commit graph can be (a) slow, (b) one way (no child relationships), and (c) accesses large amounts of other data that isn't relevant to the task at hand.

So the commit graph (implementation) is [I think] a fast, compact, sorted(?), list of commit oids that provides two way linkage through the commit graph (?) to allow fast queries within the Git codebase.

The commit graph is normally considered immutable, however the DAG commit graph can be extended by new commits, trimmed by branch deletion, rebasing, forced push, etc, or even reorganised via 'replace' or grafts commits, which must then be reflected in the commit graph (implementation).

It just felt that there is a gap between the high level DAG, explained in the glossary, and the commit-graph That perhaps the technical/commit-graph.txt ought to summarise.

--
Philip
  More details about the
feature will be added as we add functionality. This introduction gives a
high-level overview to the goals of the feature and the basic layout of
commit-graph chains.

Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
---
  Documentation/technical/commit-graph.txt | 59 ++++++++++++++++++++++++
  1 file changed, 59 insertions(+)

diff --git a/Documentation/technical/commit-graph.txt b/Documentation/technical/commit-graph.txt
index fb53341d5e..1dca3bd8fe 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -127,6 +127,65 @@ Design Details
    helpful for these clones, anyway. The commit-graph will not be read or
    written when shallow commits are present.
+Commit Graphs Chains
+--------------------
+
+Typically, repos grow with near-constant velocity (commits per day). Over time,
+the number of commits added by a fetch operation is much smaller than the
+number of commits in the full history. By creating a "chain" of commit-graphs,
+we enable fast writes of new commit data without rewriting the entire commit
+history -- at least, most of the time.
+
+## File Layout
+
+A commit-graph chain uses multiple files, and we use a fixed naming convention
+to organize these files. Each commit-graph file has a name
+`$OBJDIR/info/commit-graphs/graph-{hash}.graph` where `{hash}` is the hex-
+valued hash stored in the footer of that file (which is a hash of the file's
+contents before that hash). For a chain of commit-graph files, a plain-text
+file at `$OBJDIR/info/commit-graphs/commit-graph-chain` contains the
+hashes for the files in order from "lowest" to "highest".
+
+For example, if the `commit-graph-chain` file contains the lines
+
+```
+	{hash0}
+	{hash1}
+	{hash2}
+```
+
+then the commit-graph chain looks like the following diagram:
+
+ +-----------------------+
+ |  graph-{hash2}.graph  |
+ +-----------------------+
+	  |
+ +-----------------------+
+ |                       |
+ |  graph-{hash1}.graph  |
+ |                       |
+ +-----------------------+
+	  |
+ +-----------------------+
+ |                       |
+ |                       |
+ |                       |
+ |  graph-{hash0}.graph  |
+ |                       |
+ |                       |
+ |                       |
+ +-----------------------+
+
+Let X0 be the number of commits in `graph-{hash0}.graph`, X1 be the number of
+commits in `graph-{hash1}.graph`, and X2 be the number of commits in
+`graph-{hash2}.graph`. If a commit appears in position i in `graph-{hash2}.graph`,
+then we interpret this as being the commit in position (X0 + X1 + i), and that
+will be used as its "graph position". The commits in `graph-{hash2}.graph` use these
+positions to refer to their parents, which may be in `graph-{hash1}.graph` or
+`graph-{hash0}.graph`. We can navigate to an arbitrary commit in position j by checking
+its containment in the intervals [0, X0), [X0, X0 + X1), [X0 + X1, X0 + X1 +
+X2).
+
  Related Links
  -------------
  [0] https://bugs.chromium.org/p/git/issues/detail?id=8




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux