Re: [PATCH v3 03/20] commit-graph: parse commit from chosen graph

Derrick Stolee <stolee@xxxxxxxxx> · Tue, 29 May 2018 08:31:32 -0400

On 5/27/2018 6:23 AM, Jakub Narebski wrote:
Derrick Stolee <dstolee@xxxxxxxxxxxxx> writes:

Before verifying a commit-graph file against the object database, we
need to parse all commits from the given commit-graph file. Create
parse_commit_in_graph_one() to target a given struct commit_graph.
If I understand it properly the problem is that when verifying against
the object database we want to check one single commit-graph file, not
concatenation of data from commit-graph file for the repository and
commit-graph files from its alternates -- like prepare_commit_graph()
does; which is called by parse_commit_in_graph().

Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
O.K., so you introduce here a layer of indirection; parse_commit_in_graph()
now just uses parse_commit_in_graph_one(), passing core_commit_graph
(or the_commit_graph) to it, after checking that core_commit_graph is set
(which handles the case when feature is not turned off) and loading
commit-graph file.

Nice and simple 'split function' refactoring, with new function taking
over when there is commit graph file prepared.


So, after the changes:
* parse_commit_in_graph() is responsible for checking whether to use
   commit-graph feature and ensuring that data from commit-graph is
   loaded, where it passes the control to parse_commit_in_graph_one()
* parse_commit_in_graph_one() checks whether commit-graph feature is
   turned on, whether commit we are interested in was already parsed,
   and then uses fill_commit_in_graph() to actually get the data
* fill_commit_in_graph() gets data out of commit-graph file, extracting
   it from commit data chunk (and if needed large edges chunk).

All those functions return 1 if they got data from commit-graph, and 0
if they didn't.


One minor nitpick / complaint / question is about naming of global
variables used here, namely:
* static struct commit_graph *commit_graph
   from commit-graph.c for global storage of commit-graph[s] data
* int core_commit_graph
   from environment.c for storing core.commitGraph config

But I see that at least the latter is common convention in Git source
code; I guess that the former maybe follows convention as used for "the
index" and "the repository" - additionally it is static / file-local.

See also `struct packed_git *packed_git;` from cache.h.


---
  commit-graph.c | 18 +++++++++++++++---
  1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 82295f0975..78ba0edc80 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -310,7 +310,7 @@ static int find_commit_in_graph(struct commit *item, struct commit_graph *g, uin
  	}
  }
  
-int parse_commit_in_graph(struct commit *item)
+static int parse_commit_in_graph_one(struct commit_graph *g, struct commit *item)
  {
  	uint32_t pos;
  
@@ -318,9 +318,21 @@ static int parse_commit_in_graph_one(struct commit_graph *g, struct commit *item)
  	if (!core_commit_graph)
  		return 0;
All right, we check that commit-graph feature is enabled because
parse_commit_in_graph_one() will be used standalone, not by being
invoked from parse_commit_in_graph().

This check is fast.

  	if (item->object.parsed)
  		return 1;
Sidenote: I just wonder why object.parsed and not for example
object.graph_pos is used to checck whether the object was filled if
possible with commit-graph data...

Here we are filling all data, including commit date and parents. We use 
load_commit_graph_info() when we only need the graph_pos and generation 
values.


+
+	if (find_commit_in_graph(item, g, &pos))
+		return fill_commit_in_graph(item, g, pos);
+
+	return 0;
+}
+
+int parse_commit_in_graph(struct commit *item)
+{
+	if (!core_commit_graph)
+		return 0;
All right, this check is here to short-circuit and make it so git does
not even try to lod commit-graph file[s] if the feature is disabled.

+
  	prepare_commit_graph();
-	if (commit_graph && find_commit_in_graph(item, commit_graph, &pos))
-		return fill_commit_in_graph(item, commit_graph, pos);
+	if (commit_graph)
+		return parse_commit_in_graph_one(commit_graph, item);
  	return 0;
  }