Re: [PATCH v2 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()

Junio C Hamano <gitster@xxxxxxxxx> · Fri, 24 Jun 2022 09:56:16 -0700

Han Xin <hanxin.hx@xxxxxxxxxxxxx> writes:

> If a commit is in the commit graph, we would expect the commit to also
> be present.

Hmph, is that a fundamental requirement, or is that a limitation of
the current implementation?  Naïvely, I do not quite see why we
cannot first partially clone from a remote, access objects that
locally do not exist and lazily fetch them from the promissor, and
then build a reachability graph.  I expect that the resulting commit
graph records the lazily fetched objects at that point.  And then we
should be able to "lose" the lazily fetched objects that we know we
can fetch from the promissor again when we need them in the future.
And we would be in a situation where commits are pruned away, not
locally available in our object store, but can be (re)fetched from
the promisor, no?

> So we should use has_object() instead of
> repo_has_object_file(), which will help us avoid getting into an endless
> loop of lazy fetch.

It all depends on the reason we call lookup_commit_in_graph(), I
think.  Is there an easy way to remember the fact that we are
checking if object X is here with repo_has_object_file(X), so that
an on-demand fetch that happens when X does not locally exist would
not bother checking with lookup_commit_in_graph()?  IOW, temporarily
disable the use of commit-graph when we are lazily fetching?

> When we found the commit in the graph in lookup_commit_in_graph(),
> but the commit is missing from the repository, we will try
> promisor_remote_get_direct() and then enter another loop.  While
> sometimes it will finally succeed because it cannot fork
> subprocess,

Is that a mode of "succeed"-ing?  Or merely a way to exit an endless
loop that does not make any progress with a failure?

> it has exhausted the local process resources and can be harmful to the
> remote service.
>
> Signed-off-by: Han Xin <hanxin.hx@xxxxxxxxxxxxx>
> ---

I think the single-liner change in the patch is a good one, but I am
having a hard time to agree with the reasoning above that explains
why it is a good change.

Here is an attempt to express a reasoning I can understand, can
agree with, and (I think) better describes why the change is a good
one.  Does my understanding of the problem and the solution totally
misses the mark?

	The commit-graph is used to opportunistically optimize
	accesses to certain pieces of information on commit objects,
	and lookup_commit_in_graph() tries to say "no" when the
	requested commit does not locally exist by returning NULL,
	in which case the caller can ask for (which may result in
	on-demand fetching from a promisor remote) and parse the
	commit object itself.

	However, it uses a wrong helper, repo_has_object_file(), to
	do so.  This helper not only checks if an object is
	immediately available in the local object store, but also
	tries to fetch from a promisor remote.  But the fetch
	machinery calls lookup_commit_in_graph(), thus causing an
	infinite loop.

	We should make lookup_commit_in_graph() expect that a commit
	given to it can be legitimately missing from the local
	object store, by using the has_object_file() helper instead.

> diff --git a/t/t5329-no-lazy-fetch-with-commit-graph.sh b/t/t5329-no-lazy-fetch-with-commit-graph.sh
> new file mode 100755
> index 0000000000..4d25d2c950
> --- /dev/null
> +++ b/t/t5329-no-lazy-fetch-with-commit-graph.sh

Hmph, does this short-test need a completely new file?

> @@ -0,0 +1,47 @@
> +#!/bin/sh
> +
> +test_description='test for no lazy fetch with the commit-graph'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup: prepare a repository with a commit' '
> +	git init with-commit &&
> +	test_commit -C with-commit the-commit &&
> +	oid=$(git -C with-commit rev-parse HEAD)
> +'
> +
> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
> +	git init with-commit-graph &&
> +	echo "$(pwd)/with-commit/.git/objects" \
> +		>with-commit-graph/.git/objects/info/alternates &&
> +	# create a ref that points to the commit in alternates
> +	git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" &&
> +	# prepare some other objects to commit-graph
> +	test_commit -C with-commit-graph somthing &&

somthing? something?

> +	git -c gc.writeCommitGraph=true -C with-commit-graph gc &&
> +	test_path_is_file with-commit-graph/.git/objects/info/commit-graph
> +'
> +
> +test_expect_success 'setup: change the alternates to what without the commit' '
> +	git init --bare without-commit &&
> +	echo "$(pwd)/without-commit/objects" \
> +		>with-commit-graph/.git/objects/info/alternates &&

Doesn't this deliberately _corrupt_ the with-commit-graph repository
that depended on the object whose name is $oid in with-commit
repository?  Do we require a corrupt repository to trigger the "bug"?

> +	test_must_fail git -C with-commit-graph cat-file -e $oid
> +'
> +
> +test_expect_success 'setup: prepare another commit to fetch' '
> +	test_commit -C with-commit another-commit &&
> +	anycommit=$(git -C with-commit rev-parse HEAD)

anycommit?  another_commit?  Be consistent in naming.

> +'
> +
> +test_expect_success ULIMIT_PROCESSES 'fetch any commit from promisor with the usage of the commit graph' '

So we did all of the above set-up sequences only to skip the most
interesting test, if we were testing with "dash"?  I suspect that it
may be cleaner to put the prerequisite to the whole file with the
"early test_done" trick like t0051 and t3008.

> +	git -C with-commit-graph remote add origin "$(pwd)/with-commit" &&
> +	git -C with-commit-graph config remote.origin.promisor true &&
> +	git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
> +	GIT_TRACE="$(pwd)/trace" run_with_limited_processses \
> +		git -C with-commit-graph fetch origin $anycommit 2>err &&
> +	test_i18ngrep ! "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
> +	test $(grep "fetch origin" trace | wc -l) -eq 1
> +'
> +
> +test_done

Thanks.