Re: [PATCH v3 01/14] commit-graph: document commit-graph chains

Philip Oakley <philipoakley@xxxxxxx> · Thu, 6 Jun 2019 22:59:00 +0100

Hi Stolee,

We may be talking at cross-purposes.
On 06/06/2019 18:09, Derrick Stolee wrote:
On 6/6/2019 8:10 AM, Philip Oakley wrote:
Hi Derrick,

On 03/06/2019 17:03, Derrick Stolee via GitGitGadget wrote:
From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>

Add a basic description of commit-graph chains.
Not really your problem, but I did notice that we don't actually explain what we mean here by a commit graph (before we start chaining them), and the distinction between the generic concept and the specific implementation.
That was the point of my comment. We have not explained, in either the
man page or the technical docs, why we need (another) commit graph.
It's an understanding gap.

If I understand it correctly, the regular DAG (directed acyclic graph) already inherently contains the commit graph, showing the parent(s) of each commit. Hence, why do we need another? (which then needs explaining the what/why/how)

So, in one sense, another commit chain is potentially redundant data. What hasn't been surfaced (for the reader coming later) is probably that accessing the DAG commit graph (a) can be slow, (b) is one way (no child relationships), and (c) touches large amounts of other data that isn't relevant to the task at hand.

So the commit graph (implementation) is [I think] a fast, compact, sorted(?), list of commit oids that provides two way linkage through the commit graph (?) to allow fast queries within the Git codebase.
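
As a toy illustration of point (b) above, here is a minimal Python sketch of a DAG in which each commit records only its parents. The commit names, `children_index`, and `sorted_oids` are hypothetical and exist only for this example; this is not Git's code or its on-disk format. It shows that child relationships are not stored and have to be derived by visiting every commit, and that a sorted oid list is a separate, derived structure.

```python
# Toy model (assumption): each commit records only its parents.
# The oids "A".."D" are made up for illustration.
parents = {
    "A": [],          # root commit
    "B": ["A"],
    "C": ["A"],
    "D": ["B", "C"],  # merge commit
}


def children_index(parents):
    """Invert the parent links. Needs one pass over every commit."""
    children = {oid: [] for oid in parents}
    for oid, parent_oids in parents.items():
        for parent in parent_oids:
            children[parent].append(oid)
    return children


def sorted_oids(parents):
    """A sorted list of oids, which allows binary-search lookup."""
    return sorted(parents)


children = children_index(parents)
print(children["A"])  # the commits that name "A" as a parent
```

Whether a derived index of this kind should hold child links at all is a design choice; the sketch only shows what the parent-only DAG does not give you directly.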

The commit graph is normally considered immutable,
_Commits_ are immutable. The graph grows as commits are added.
I was aware that individual commits are immutable. However the tips, 
grafts and replacements can change the topology of the graph (especially 
the grafts and replacements, hence the desire to have something that
acts as a guide to what, generally, we are trying to achieve).

This may be the crux of your confusion, since the commit-graph
file can become stale as commits are added by 'git commit' or
'git fetch'. The point of the incremental file format is to
update the commit-graph data without rewriting the entire thing
every time.
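
The idea of updating without rewriting can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual commit-graph chain format: `GraphChain`, `add_layer`, and `lookup` are hypothetical names, and each "layer" is just an in-memory dict standing in for one file in the chain.

```python
class GraphChain:
    """Toy model of an incremental chain: old layers are never rewritten."""

    def __init__(self):
        # Oldest layer first; each layer maps oid -> list of parent oids.
        self.layers = []

    def lookup(self, oid):
        """Search the newest layer first, then fall back to older ones."""
        for layer in reversed(self.layers):
            if oid in layer:
                return layer[oid]
        return None  # unknown here; a caller would fall back to the object store

    def add_layer(self, new_commits):
        """Append only the commits that no existing layer already holds."""
        layer = {
            oid: parent_oids
            for oid, parent_oids in new_commits.items()
            if self.lookup(oid) is None
        }
        if layer:
            self.layers.append(layer)


chain = GraphChain()
chain.add_layer({"A": [], "B": ["A"]})     # initial write
chain.add_layer({"B": ["A"], "C": ["B"]})  # later update: only "C" is new
print(len(chain.layers))
```

The first layer is left untouched by the second update, which is the property the incremental format is after; when and how layers get merged is outside this sketch.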

Does this help clarify what's going on?
Only slightly, see below.

however the DAG commit graph can be extended by new commits, trimmed by branch deletion, rebasing, forced push, etc., or even reorganised via 'replace' or graft commits, which must then be reflected in the commit graph (implementation).
These things create new commit objects, which would not be in
the commit-graph file until it is rewritten.

It just felt that there is a gap between the high level DAG, explained in the glossary, and the commit-graph, which perhaps technical/commit-graph.txt ought to summarise.
I do think that technical/commit-graph.txt does summarize a lot
about the commit-graph _file_ and how that accelerates walks on
the high-level DAG. The added content in this patch does assume
a full understanding of the previous contents of that file.
The current (prior) documentation is a bit of a Catch-22 with regard to
that assumed full understanding, hence my comment, including the "Not
really your problem," bit.

Thanks,
-Stolee

Philip