Han Xin <hanxin.hx@xxxxxxxxxxxxx> writes: > If a commit is in the commit graph, we would expect the commit to also > be present. Hmph, is that a fundamental requirement, or is that a limitation of the current implementation? Naïvely, I do not quite see why we cannot first partially clone from a remote, access objects that locally do not exist and lazily fetch them from the promissor, and then build a reachability graph. I expect that the resulting commit graph records the lazily fetched objects at that point. And then we should be able to "lose" the lazily fetched objects that we know we can fetch from the promissor again when we need them in the future. And we would be in a situation where commits are pruned away, not locally available in our object store, but can be (re)fetched from the promisor, no? > So we should use has_object() instead of > repo_has_object_file(), which will help us avoid getting into an endless > loop of lazy fetch. It all depends on the reason we call lookup_commit_in_graph(), I think. Is there an easy way to remember the fact that we are checking if object X is here with repo_has_object_file(X), so that an on-demand fetch that happens when X does not locally exist would not bother checking with lookup_commit_in_graph()? IOW, temporarily disable the use of commit-graph when we are lazily fetching? > When we found the commit in the graph in lookup_commit_in_graph(), > but the commit is missing from the repository, we will try > promisor_remote_get_direct() and then enter another loop. While > sometimes it will finally succeed because it cannot fork > subprocess, Is that a mode of "succeed"-ing? Or merely a way to exit an endless loop that does not make any progress with a failure? > it has exhausted the local process resources and can be harmful to the > remote service. > > Signed-off-by: Han Xin <hanxin.hx@xxxxxxxxxxxxx> > --- I think the single-liner change in the patch is a good one, but I am having a hard time to agree with the reasoning above that explains why it is a good change. Here is an attempt to express a reasoning I can understand, can agree with, and (I think) better describes why the change is a good one. Does my understanding of the problem and the solution totally misses the mark? The commit-graph is used to opportunistically optimize accesses to certain pieces of information on commit objects, and lookup_commit_in_graph() tries to say "no" when the requested commit does not locally exist by returning NULL, in which case the caller can ask for (which may result in on-demand fetching from a promisor remote) and parse the commit object itself. However, it uses a wrong helper, repo_has_object_file(), to do so. This helper not only checks if an object is immediately available in the local object store, but also tries to fetch from a promisor remote. But the fetch machinery calls lookup_commit_in_graph(), thus causing an infinite loop. We should make lookup_commit_in_graph() expect that a commit given to it can be legitimately missing from the local object store, by using the has_object_file() helper instead. > diff --git a/t/t5329-no-lazy-fetch-with-commit-graph.sh b/t/t5329-no-lazy-fetch-with-commit-graph.sh > new file mode 100755 > index 0000000000..4d25d2c950 > --- /dev/null > +++ b/t/t5329-no-lazy-fetch-with-commit-graph.sh Hmph, does this short-test need a completely new file? > @@ -0,0 +1,47 @@ > +#!/bin/sh > + > +test_description='test for no lazy fetch with the commit-graph' > + > +. ./test-lib.sh > + > +test_expect_success 'setup: prepare a repository with a commit' ' > + git init with-commit && > + test_commit -C with-commit the-commit && > + oid=$(git -C with-commit rev-parse HEAD) > +' > + > +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' ' > + git init with-commit-graph && > + echo "$(pwd)/with-commit/.git/objects" \ > + >with-commit-graph/.git/objects/info/alternates && > + # create a ref that points to the commit in alternates > + git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" && > + # prepare some other objects to commit-graph > + test_commit -C with-commit-graph somthing && somthing? something? > + git -c gc.writeCommitGraph=true -C with-commit-graph gc && > + test_path_is_file with-commit-graph/.git/objects/info/commit-graph > +' > + > +test_expect_success 'setup: change the alternates to what without the commit' ' > + git init --bare without-commit && > + echo "$(pwd)/without-commit/objects" \ > + >with-commit-graph/.git/objects/info/alternates && Doesn't this deliberately _corrupt_ the with-commit-graph repository that depended on the object whose name is $oid in with-commit repository? Do we require a corrupt repository to trigger the "bug"? > + test_must_fail git -C with-commit-graph cat-file -e $oid > +' > + > +test_expect_success 'setup: prepare another commit to fetch' ' > + test_commit -C with-commit another-commit && > + anycommit=$(git -C with-commit rev-parse HEAD) anycommit? another_commit? Be consistent in naming. > +' > + > +test_expect_success ULIMIT_PROCESSES 'fetch any commit from promisor with the usage of the commit graph' ' So we did all of the above set-up sequences only to skip the most interesting test, if we were testing with "dash"? I suspect that it may be cleaner to put the prerequisite to the whole file with the "early test_done" trick like t0051 and t3008. > + git -C with-commit-graph remote add origin "$(pwd)/with-commit" && > + git -C with-commit-graph config remote.origin.promisor true && > + git -C with-commit-graph config remote.origin.partialclonefilter blob:none && > + GIT_TRACE="$(pwd)/trace" run_with_limited_processses \ > + git -C with-commit-graph fetch origin $anycommit 2>err && > + test_i18ngrep ! "fatal: promisor-remote: unable to fork off fetch subprocess" err && > + test $(grep "fetch origin" trace | wc -l) -eq 1 > +' > + > +test_done Thanks.