On Mon, Sep 09, 2019 at 02:05:53PM -0700, Junio C Hamano wrote:

> Junio C Hamano <gitster@xxxxxxxxx> writes:
>
> > Isn't that what is going on?  I thought I dug up the original that
> > introduced the has_object_file() call to this codepath to make sure
> > we understand why we make the check (and I expected the person who
> > is proposing this change to do the same and record the finding in
> > the proposed log message).
> >
> > I am running out of time today, and will revisit later this week
> > (I'll be down for at least two days starting tomorrow, by the way).
>
> Here is what I came up with.
>
> The cache-tree data structure is used to speed up the comparison
> between the HEAD and the index, and when the index is updated by
> a cherry-pick (for example), a tree object that would represent
> the paths in the index in a directory is constructed in-core, to
> see if such a tree object already exists in the object store.
>
> When the lazy-fetch mechanism was introduced, we mistakenly
> converted this "does the tree exist?" check into an "if it does
> not, and if we lazily cloned, see if the remote has it" call.
> Since the whole point of this check is to repair the cache-tree
> by opportunistically recording an already existing tree object,
> we shouldn't even try to fetch one from the remote.
>
> Pass the OBJECT_INFO_SKIP_FETCH_OBJECT flag to make sure we only
> check for existence in the local object store without triggering
> the lazy fetch mechanism.

As a third-party observer, that explanation makes sense to me.

I also wondered whether this means we should be using OBJECT_INFO_QUICK.
I.e., do we expect to see a "miss" here often, forcing us to re-scan the
pack directory?  Reading dd0c34c46b (cache-tree: protect against "git
prune"., 2006-04-24), I think the answer is "no".

-Peff
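
For readers following along, a rough sketch of the kind of call-site
change being described might look something like the snippet below.
This is only an illustration against the repair path of update_one()
in cache-tree.c from around that era (the buffer, it, repair, and
to_invalidate variables belong to that function), not the exact patch:

	if (repair) {
		struct object_id oid;

		/* Hash the tree we just built in-core... */
		hash_object_file(buffer.buf, buffer.len, tree_type, &oid);

		/*
		 * ...and record it only if it already exists in the
		 * local object store.  OBJECT_INFO_SKIP_FETCH_OBJECT
		 * keeps a partial clone from lazily fetching the tree
		 * from the promisor remote just to satisfy this
		 * opportunistic check.
		 *
		 * OBJECT_INFO_QUICK could be OR'd in as well to avoid
		 * re-scanning the pack directory on a miss, but per the
		 * discussion above a miss is not expected to be common
		 * here.
		 */
		if (has_object_file_with_flags(&oid,
					       OBJECT_INFO_SKIP_FETCH_OBJECT))
			oidcpy(&it->oid, &oid);
		else
			to_invalidate = 1;
	}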