Hi, Jonathan Tan wrote: > In fetch_pack() (and all functions it calls), pass > OBJECT_INFO_SKIP_FETCH_OBJECT whenever we query an object that could be > a tree or blob that we do not want to be lazy-fetched even if it is > absent. Thus, the only lazy-fetches occurring for trees and blobs are > when resolving deltas. > > Thus, we can remove fetch_if_missing=0 from builtin/fetch.c. Remove > this, and also add a test ensuring that such objects are not > lazy-fetched. (We might be able to remove fetch_if_missing=0 from other > places too, but I have limited myself to builtin/fetch.c in this commit > because I have not written tests for the other commands yet.) Hooray! Thanks much, this looks easier to maintain. > Note that commits and tags may still be lazy-fetched. I limited myself > to objects that could be trees or blobs here because Git does not > support creating such commit- and tag-excluding clones yet, and even if > such a clone were manually created, Git does not have good support for > fetching a single commit (when fetching a commit, it and all its > ancestors would be sent). Is there a place we could put a NEEDSWORK comment to avoid confusion when debugging if this gets introduced later? Even if not, this seems like a sensible choice. > Signed-off-by: Jonathan Tan <jonathantanmy@xxxxxxxxxx> > --- > I've verified that this also solves the bug explained in: > https://public-inbox.org/git/20191007181825.13463-1-jonathantanmy@xxxxxxxxxx/ Might be worth mentioning the example from there in the commit message as well, to help explain the context behind the change. I would still be in favor of applying that more conservative change to "master", even this late in the -rc cycle. [...] > --- a/builtin/fetch.c > +++ b/builtin/fetch.c > @@ -1074,7 +1074,8 @@ static int check_exist_and_connected(struct ref *ref_map) > * we need all direct targets to exist. > */ > for (r = rm; r; r = r->next) { > - if (!has_object_file(&r->old_oid)) > + if (!has_object_file_with_flags(&r->old_oid, > + OBJECT_INFO_SKIP_FETCH_OBJECT)) Yes. [...] > @@ -1755,8 +1756,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) > > packet_trace_identity("fetch"); > > - fetch_if_missing = 0; > - This is the scary part, but in an "uncomfortably exciting" sense rather than a worrying one. Thanks for adding a test. [...] > --- a/fetch-pack.c > +++ b/fetch-pack.c > @@ -673,7 +673,8 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, > struct object *o; > > if (!has_object_file_with_flags(&ref->old_oid, > - OBJECT_INFO_QUICK)) > + OBJECT_INFO_QUICK | > + OBJECT_INFO_SKIP_FETCH_OBJECT)) Should we make OBJECT_INFO_QUICK always imply OBJECT_INFO_SKIP_FETCH_OBJECT? I would suspect that if we are willing to avoid checking thoroughly locally, checking remotely would be even more undesirable. [...] > --- a/t/t5616-partial-clone.sh > +++ b/t/t5616-partial-clone.sh > @@ -296,6 +296,75 @@ test_expect_success 'partial clone with unresolvable sparse filter fails cleanly > test_i18ngrep "unable to parse sparse filter data in" err > ' > > +setup_triangle () { > + rm -rf big-blob.txt server client promisor-remote && > + > + touch big-blob.txt && Tests seem to prefer spelling this as >big-blob.txt && because that specifes the content of the file. > + for i in $(seq 1 100) > + do > + echo line $i >>big-blob.txt > + done && Should this use test_seq for better portability? nit: can avoid a subshell: test_seq 1 100 | sed -e 's/^/line /' >big-blob.txt [...] > +test_expect_success 'fetch lazy-fetches only to resolve deltas' ' > + setup_triangle && > + > + # Exercise to make sure it works. Git will not fetch anything from the > + # promisor remote other than for the big blob (because it needs to > + # resolve the delta). > + GIT_TRACE_PACKET="$(pwd)/trace" git -C client \ > + fetch "file://$(pwd)/server" master && > + > + # Verify the assumption that the client needed to fetch the delta base > + # to resolve the delta. > + git hash-object big-blob.txt >hash && > + grep "want $(cat hash)" trace nit: can avoid using cat: hash=$(git hash-object big-blob.txt) && grep "want $hash" trace Thanks and hope that helps, Jonathan