Re: [PATCH] name-rev: use generation numbers if available

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/28/2022 2:07 PM, Jacob Keller wrote:
> From: Jacob Keller <jacob.keller@xxxxxxxxx>
> 
> If a commit in a sequence of linear history has a non-monotonically
> increasing commit timestamp, git name-rev might not properly name the
> commit.
> 
> This occurs because name-rev uses a heuristic of the commit date to
> avoid searching down tags which lead to commits that are older than the
> named commit. This is intended to avoid work on larger repositories.
> 
> This heuristic impacts git name-rev, and by extension git describe
> --contains which is built on top of name-rev.
> 
> Further more, if --annotate-stdin is used, the heuristic is not enabled
> because the full history has to be analyzed anyways. This results in
> some confusion if a user sees that --annotate-stdin works but a normal
> name-rev does not.
> 
> If the repository has a commit graph, we can use the generation numbers
> instead of using the commit dates. This is essentially the same check
> except that generation numbers make it exact, where the commit date
> heuristic could be incorrect due to clock errors.
> 
> Add a test case which covers this behavior and shows how the commit
> graph makes the name-rev process work.
> 
> Signed-off-by: Jacob Keller <jacob.keller@xxxxxxxxx>
> ---
> The initial implementation of this came from [1]. Should this have Stolee's
> sign-off?
> 
> [1]: https://lore.kernel.org/git/42d2a9fe-a3f2-f841-dcd1-27a0440521b6@xxxxxxxxxx/

I think your implementation is sufficiently different (and better)
that you don't need my co-authorship _or_ sign-off.

> +static void set_commit_cutoff(struct commit *commit)
> +{
> +	timestamp_t generation;
> +
> +	if (cutoff > commit->date)
> +		cutoff = commit->date;
> +
> +	generation = commit_graph_generation(commit);
> +	if (generation_cutoff > generation)
> +		generation_cutoff = generation;
> +}

I appreciate that you split this out into its own method to isolate
the logic.

> +/* Check if a commit is before the cutoff. Prioritize generation numbers
> + * first, but use the commit timestamp if we lack generation data.
> + */
> +static int commit_is_before_cutoff(struct commit *commit)
> +{
> +	if (generation_cutoff < GENERATION_NUMBER_INFINITY)
> +		return commit_graph_generation(commit) < generation_cutoff;
> +
> +	return commit->date < cutoff;
> +}

There are two subtle things going on here when generation_cutoff is
zero:

1. In a commit-graph with topological levels _or_ generation numbers v2,
   commit_graph_generation(commit) will always be positive, so we don't
   need to do the lookup.

2. If the commit-graph was written by an older Git version before
   topological levels were implemented, then the generation of commits
   in the commit-graph are all zero(!). This means that the logic here
   would be incorrect for the "all" case.

The fix for both cases is to return 1 if generation_cutoff is zero:

	if (!generaton_cutoff)
		return 1;
> -	if (start_commit->date < cutoff)
> +	if (commit_is_before_cutoff(start_commit))

> -			if (parent->date < cutoff)
> +			if (commit_is_before_cutoff(parent))

Nice replacements.

> -	if (all || annotate_stdin)
> +	if (all || annotate_stdin) {
> +		generation_cutoff = 0;
>  		cutoff = 0;
> +	}

Good.

> -		if (commit) {
> -			if (cutoff > commit->date)
> -				cutoff = commit->date;
> -		}
> +		if (commit)
> +			set_commit_cutoff(commit);

Another nice replacement.

> +# A-B-C-D-E-main
> +#
> +# Where C has a non-monotonically increasing commit timestamp w.r.t. other
> +# commits
> +test_expect_success 'non-monotonic commit dates setup' '
> +	UNIX_EPOCH_ZERO="@0 +0000" &&
> +	git init non-monotonic &&
> +	test_commit -C non-monotonic A &&
> +	test_commit -C non-monotonic --no-tag B &&
> +	test_commit -C non-monotonic --no-tag --date "$UNIX_EPOCH_ZERO" C &&
> +	test_commit -C non-monotonic D &&
> +	test_commit -C non-monotonic E
> +'
> +
> +test_expect_success 'name-rev with commitGraph handles non-monotonic timestamps' '
> +	test_config -C non-monotonic core.commitGraph true &&
> +	(
> +		cd non-monotonic &&
> +
> +		# Ensure commit graph is up to date
> +		git -c gc.writeCommitGraph=true gc &&

"git commit-graph write --reachable" would suffice here.


> +
> +		echo "main~3 tags/D~2" >expect &&
> +		git name-rev --tags main~3 >actual &&
> +
> +		test_cmp expect actual
> +	)
> +'
> +
> +test_expect_success 'name-rev --all works with non-monotonic' '

This is working because of the commit-graph, right? We still have
it from the previous test, so we aren't testing that this works
when we only have the commit date as a cutoff.

> +	(
> +		cd non-monotonic &&
> +
> +		cat >expect <<-\EOF &&
> +		E
> +		D
> +		D~1
> +		D~2
> +		A
> +		EOF
> +
> +		git log --pretty=%H >revs &&
> +		git name-rev --tags --annotate-stdin --name-only <revs >actual &&
> +
> +		test_cmp expect actual
> +	)

Do you want to include a test showing the "expected" behavior
when there isn't a commit-graph file? You might need to delete
an existing commit-graph (it will exist in the case of
GIT_TEST_COMMIT_GRAPH=1).

I also see that you intended to test the "--all" option, which
is not included in your test. That's probably the real key to
getting this test to work correctly. Deleting the graph will
probably cause a failure on this test unless "--all" is added.

Thanks,
-Stolee




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux