Re: [PATCH v2 08/10] commit-graph: handle mixed generation commit chains

On 8/11/2020 7:36 AM, Abhishek Kumar wrote:
> On Mon, Aug 10, 2020 at 12:42:29PM -0400, Derrick Stolee wrote:
>> On 8/8/2020 10:53 PM, Abhishek Kumar via GitGitGadget wrote:
>>
>> ...
>>
>> Hm. So this scenario actually disables generation numbers completely
>> in the event that anything in the chain disagrees. I think this is
>> not the right way to approach the situation, as it will significantly
>> punish users in this state with slow performance.
>>
>> The patch I sent [1] is probably better: it uses generation number
>> v1 if the tip of the chain does not have a GDAT chunk.
>>
>> [1] https://lore.kernel.org/git/a3910f82-ab2e-bf35-ac43-c30d77f3c96b@xxxxxxxxx/
>>
> 
> Yes, the patch is a clear improvement over my (convoluted and incorrect)
> logic. Will add.
> 
>>
>> ...
>>
>> Please make a point to move the line that checks GIT_TEST_COMMIT_GRAPH_NO_GDAT
>> from its current location to after this line. We want to make sure that the
>> environment variable is checked _last_. The best location is likely the start
>> of the implementation of compute_generation_numbers(), or immediately before
>> the call to the method.
>>
> 
> Sure, will do.
> 
>>
>> ...
>>
>> It would be valuable to double-check here that the values in the GDAT chunk
>> are correct. I'm concerned about the possibility that the 'generation'
>> member of struct commit_graph_data gets filled with topological level during
>> parsing and then that is written as an offset into the CDAT chunk.
>>
>> Perhaps this is best left for a follow-up series that updates the 'verify'
>> subcommand to check the GDAT chunk.
> 
> If I understand correctly, one way to update the 'verify'
> subcommand to check the GDAT chunk as well would be to make use of the
> flag variable introduced in your patch. We can isolate generation number
> related checks and run checks once with flag = 1 (checking corrected
> commit dates) and once with flag = 0 (checking topological levels).
> 
> This has the unfortunate effect of filling all commits twice, but as we
> cannot change the commit_graph_data->generation any other way, I see no
> alternatives without changing how commit_graph_generation() works.
> 
> Would it make more sense if we add the flag to struct commit_graph
> instead of making it depend solely on g->chunk_generation_data and set
> it within parse_commit_graph()?
> 
> We would be able to control the behavior of fill_commit_graph_info() and
> we will not need to check g->chunk_generation_data before filling every
> commit.

I missed that you _already_ updated the logic in verify_commit_graph()
based on the generation. That logic should catch the problem, so it
might be enough to just add some "git commit-graph verify" commands into
your multi-level tests.

Specifically, the end result is this check:

	corrected_commit_date = commit_graph_generation(graph_commit);
	if (corrected_commit_date < max_parent_corrected_commit_date + 1)
		graph_report(_("commit-graph generation for commit %s is %"PRItime" < %"PRItime),
			     oid_to_hex(&cur_oid),
			     corrected_commit_date,
			     max_parent_corrected_commit_date + 1);

This will catch the order violations I suggested could happen. It
doesn't go the extra mile to ensure that the commit-graph stores the
exact correct value or that both pieces of data (topo-level and
corrected commit date) are correct. That is fine for now, and we
can revisit if necessary.

The diff below makes some tweaks to your split-level test to show the
logic _was_ incorrect without my patch. Please incorporate the test
changes into your series. Note in particular that I added a base
layer that includes the GDAT chunk and _then_ adds a layer without
the GDAT chunk. That is an important case!

Thanks,
-Stolee

--- >8 ---

diff --git a/commit-graph.c b/commit-graph.c
index 17623274d9..d891a8ba3a 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -674,14 +674,6 @@ int generation_numbers_enabled(struct repository *r)
 	if (!g->num_commits)
 		return 0;
 
-	/* We cannot compare topological levels and corrected commit dates */
-	while (g->base_graph) {
-		warning(_("commit-graph-chain contains mixed generation versions"));
-		if ((g->chunk_generation_data == NULL) ^ (g->base_graph->chunk_generation_data == NULL))
-			return 0;
-		g = g->base_graph;
-	}
-
 	first_generation = get_be32(g->chunk_commit_data +
 				    g->hash_len + 8) >> 2;
 
@@ -787,7 +779,7 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
 	date_low = get_be32(commit_data + g->hash_len + 12);
 	item->date = (timestamp_t)((date_high << 32) | date_low);
 
-	if (g->chunk_generation_data && (flags & COMMIT_GRAPH_GENERATION_V2))
+	if (g->chunk_generation_data)
 		graph_data->generation = item->date +
 			(timestamp_t) get_be32(g->chunk_generation_data + sizeof(uint32_t) * lex_index);
 	else
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index 1a9be5e656..721515cc23 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -443,6 +443,7 @@ test_expect_success 'setup repo for mixed generation commit-graph-chain' '
 		test_commit $i &&
 		git branch commits/$i || return 1
 	done &&
+	git commit-graph write --reachable --split &&
 	git reset --hard commits/2 &&
 	for i in $(test_seq 6 10)
 	do
@@ -455,14 +456,15 @@ test_expect_success 'setup repo for mixed generation commit-graph-chain' '
 	git reset --hard commits/4 &&
 	git merge commits/6 &&
 	git branch merge/2 &&
-	GIT_TEST_COMMIT_GRAPH_NO_GDAT=1 git commit-graph write --reachable --split &&
+	GIT_TEST_COMMIT_GRAPH_NO_GDAT=1 git commit-graph write --reachable --split=no-merge &&
 	test-tool read-graph >output &&
 	cat >expect <<-EOF &&
-	header: 43475048 1 1 3 0
-	num_commits: 12
+	header: 43475048 1 1 4 1
+	num_commits: 7
 	chunks: oid_fanout oid_lookup commit_metadata
 	EOF
-	test_cmp expect output
+	test_cmp expect output &&
+	git commit-graph verify
 '
 
 test_expect_success 'does not write generation data chunk if not present on existing tip' '
@@ -472,23 +474,25 @@ test_expect_success 'does not write generation data chunk if not present on exis
 	git merge commits/5 &&
 	git merge merge/2 &&
 	git branch merge/3 &&
-	git commit-graph write --reachable --split &&
+	git commit-graph write --reachable --split=no-merge &&
 	test-tool read-graph >output &&
 	cat >expect <<-EOF &&
 	header: 43475048 1 1 4 1
 	num_commits: 3
 	chunks: oid_fanout oid_lookup commit_metadata
 	EOF
-	test_cmp expect output
+	test_cmp expect output &&
+	git commit-graph verify
 '
 
 test_expect_success 'writes generation data chunk when commit-graph chain is replaced' '
 	cd "$TRASH_DIRECTORY/mixed" &&
-	git commit-graph write --reachable --split='replace' &&
+	git commit-graph write --reachable --split=replace &&
 	test_path_is_file $graphdir/commit-graph-chain &&
 	test_line_count = 1 $graphdir/commit-graph-chain &&
 	verify_chain_files_exist $graphdir &&
-	graph_read_expect 15
+	graph_read_expect 15 &&
+	git commit-graph verify
 '
 
 test_done


