On Mon, Sep 09, 2019 at 02:05:53PM -0700, Junio C Hamano wrote:

> Junio C Hamano <gitster@xxxxxxxxx> writes:
>
> > Isn't that what is going on?  I thought I dug up the original that
> > introduced the has_object_file() call to this codepath to make sure
> > we understand why we make the check (and I expected the person who
> > is proposing this change to do the same and record the finding in
> > the proposed log message).
> >
> > I am running out of time today, and will revisit later this week
> > (I'll be down for at least two days starting tomorrow, by the way).
>
> Here is what I came up with.
>
> The cache-tree data structure is used to speed up the comparison
> between the HEAD and the index, and when the index is updated by
> a cherry-pick (for example), a tree object that would represent
> the paths in the index in a directory is constructed in-core, to
> see if such a tree object already exists in the object store.
>
> When the lazy-fetch mechanism was introduced, we mistakenly
> converted this "does the tree exist?" check into an "if it does
> not, and if we lazily cloned, see if the remote has it" call.
> Since the whole point of this check is to repair the cache-tree
> by opportunistically recording an already existing tree object,
> we shouldn't even try to fetch one from the remote.
>
> Pass the OBJECT_INFO_SKIP_FETCH_OBJECT flag to make sure we only
> check for existence in the local object store without triggering
> the lazy fetch mechanism.

As a third-party observer, that explanation makes sense to me.

I also wondered whether this means we should be using OBJECT_INFO_QUICK.
I.e., do we expect to see a "miss" here often, forcing us to re-scan the
pack directory?  Reading dd0c34c46b (cache-tree: protect against "git
prune"., 2006-04-24), I think the answer is "no".

-Peff
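
For readers following along, a rough sketch of the kind of call-site
change being described might look something like the snippet below.
This is only an illustration against the repair path of update_one()
in cache-tree.c from around that era (the buffer, it, repair, and
to_invalidate variables belong to that function), not the exact patch:

	if (repair) {
		struct object_id oid;

		/* Hash the tree we just built in-core... */
		hash_object_file(buffer.buf, buffer.len, tree_type, &oid);

		/*
		 * ...and record it only if it already exists in the
		 * local object store.  OBJECT_INFO_SKIP_FETCH_OBJECT
		 * keeps a partial clone from lazily fetching the tree
		 * from the promisor remote just to satisfy this
		 * opportunistic check.
		 *
		 * OBJECT_INFO_QUICK could be OR'd in as well to avoid
		 * re-scanning the pack directory on a miss, but per the
		 * discussion above a miss is not expected to be common
		 * here.
		 */
		if (has_object_file_with_flags(&oid,
					       OBJECT_INFO_SKIP_FETCH_OBJECT))
			oidcpy(&it->oid, &oid);
		else
			to_invalidate = 1;
	}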