> > fetch_if_missing is set to 0 too early - ".gitmodules" here should be > > lazily fetched. Git must set fetch_if_missing to 0 before the fetch > > because as part of the fetch, packfile negotiation happens (and we do > > not want to fetch any missing objects when checking existence of > > objects)... > > Is it only me to feel that this is piling band-aids on top of > band-aids? To me, this is moving a band-aid, not adding another one. But to the bigger point... > Perhaps the addition (and enabling) of lazy-fetch should have been > done after "checking existence" calls are vetted and sifted into two > categories? Some accesses to objects are "because we need it > now---so let's lazily fetch if that is available as a fallback > option to us", as opposed to "if we do not have it locally right > now, we want to know the fact". And each callsite should be able to > declare for what reason between the two it is making the access. > > The single fetch-if-missing boolean may have been a quick-and-dirty > way to get the ball rolling, but perhaps the codebase grew up enough > so that it is time to wean off of it? This is one of the alternatives I mentioned in the original email [1] after "---". But to elaborate on that, I prefer sifting regions over sifting calls. Sifting calls into two categories might work, but it's error-prone in that we would have to do the same line-by-line analysis we did when we added the repository argument to many functions, and we would have to modify functions like parse_commit() to take a flag similar to OBJECT_INFO_SKIP_FETCH_OBJECT. Also, we would have to do the same careful inspection for future patches. Instead, we can control whether a region of code lazy-fetches by moving fetch_if_missing to the struct repository object and using a struct repository that has fetch_if_missing==0 when we don't want lazy fetching. Besides being less error-prone, the infrastructure for this has already been built (if I remember correctly, for submodules, at first). The microproject to put fetch_if_missing into struct repository and my Outreachy project 'Refactor "git index-pack" logic into library code' [2] are steps towards this goal. (I don't think that the latter is strictly necessary, but it will make things simpler. In particular, passing "-C" then the_repository->gitdir to index-pack makes a few tests fail - not many, but not zero; and even after we resolve that, we would need CLI parameters ensuring that we marshal everything we need from the_repository, including fetch_if_missing. It seems better to libify index-pack first.) [1] https://public-inbox.org/git/20191007181825.13463-1-jonathantanmy@xxxxxxxxxx/ [2] https://www.outreachy.org/apply/project-selection/#git