Re: [PATCH v3 6/8] reset: make sparse-aware (except --mixed)

Junio C Hamano <gitster@xxxxxxxxx> · Fri, 08 Oct 2021 11:31:39 -0700

Victoria Dye <vdye@xxxxxxxxxx> writes:

> Phillip Wood wrote:

>> I was looking at the callers to prime_cache_tree() this morning
>> and would like to suggest an alternative approach - just delete
>> prime_cache_tree() and all of its callers!

Do you mean the calls added by new patches without understanding
what they are doing, or all calls to it?

Every time you update a path in the index from the working tree
(e.g. "git add") and other sources, the directory in the cache-tree
that includes the path is invalidated, and the surviving subtrees of
cache-tree are used to speed up writing the index as a tree object,
doing "diff-index --cached" (hence "git status"), etc.  So over
time, the cache-tree "degrades" as you muck with the index entries.
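
To make the degradation concrete, here is a minimal toy model, not
git's actual code.  It assumes the cache-tree is a plain mapping from
directory to tree object name, with None standing for "invalid", and
that touching a path invalidates every directory leading to it.  The
names invalidate_path() and ancestors() are made up for illustration.

```python
import posixpath

def ancestors(path):
    """Yield every directory containing `path`, ending with "" for the root."""
    d = posixpath.dirname(path)
    while d:
        yield d
        d = posixpath.dirname(d)
    yield ""

def invalidate_path(cache_tree, path):
    """Mark every cached directory on the way to `path` as invalid (None)."""
    for d in ancestors(path):
        if d in cache_tree:
            cache_tree[d] = None

# Toy cache-tree: directory -> tree object name, or None when invalid.
cache_tree = {"": "t-root", "lib": "t-lib", "lib/util": "t-util", "doc": "t-doc"}

# Something like "git add lib/util/str.c" touches one path ...
invalidate_path(cache_tree, "lib/util/str.c")
# ... and only "doc" survives as a reusable subtree.
```

The more paths get touched this way, the fewer entries remain usable.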

When you write out the index as a tree, we by definition have to
know the object names of all the tree objects that correspond to
each directory in the index.  A fully valid cache-tree is saved when
it happens, so the above process can start over.

There are cases other than "git write-tree" where we can cheaply
learn the object names of all the tree objects that correspond to
each directory in the index.  When we read the index from an
existing tree object, we know which tree (and its subtrees) we
populated the index from, so we can salvage a degraded cache-tree.
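
As a rough sketch of why this is cheap: when the index was just filled
from a tree, the names of that tree and its subtrees are already known
and only need to be recorded.  The object store layout and the helper
prime_toy_cache_tree() below are hypothetical, and this is not how
git's prime_cache_tree() is written.

```python
def prime_toy_cache_tree(trees, root_oid):
    """Fill a toy cache-tree by walking an existing tree object."""
    cache_tree = {}

    def walk(oid, directory):
        # The tree we are reading from is, by construction, the tree
        # that corresponds to this directory in the index.
        cache_tree[directory] = oid
        for name, (kind, child) in trees[oid].items():
            if kind == "tree":
                walk(child, f"{directory}/{name}" if directory else name)

    walk(root_oid, "")
    return cache_tree

# Hypothetical object store: tree name -> {entry name: (kind, object name)}.
trees = {
    "t-root": {"README": ("blob", "b1"), "lib": ("tree", "t-lib")},
    "t-lib": {"util": ("tree", "t-util"), "main.c": ("blob", "b2")},
    "t-util": {"str.c": ("blob", "b3")},
}
primed = prime_toy_cache_tree(trees, "t-root")
```

No hashing is involved; every directory ends up with a valid entry
simply because we remember where the index contents came from.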

"reset --hard" and "reset --mixed" may be good opportunities, as is
"checkout <branch>" that starts from a clean index.  And cache-tree
priming is a mechanism to take advantage of such an opportunity.

The cache-tree does not have to be primed and all you lose is
performance, so priming can be removed mostly "without an issue", if
you are not paying attention to cache-tree degradation.  Priming
with incorrect data, however, would leave permanent damage by
writing a wrong tree via "git write-tree" (hence "git commit") and
showing a wrong diff via "git diff-index [--cached]" (hence "git
status" and probably "git add -- <pathspec>"), so not priming is
safer than priming incorrectly.
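
The asymmetry can be illustrated with a toy write-tree that trusts
whatever valid entries it finds in the cache-tree.  Everything here is
an assumption made for the sketch: the index is a mapping from path to
blob name, tree names are SHA-1 digests of a made-up entry format, and
write_tree() is not git's implementation.

```python
import hashlib

def write_tree(index, cache_tree, directory=""):
    """Return the toy tree name for `directory`, reusing valid cache entries."""
    cached = cache_tree.get(directory)
    if cached is not None:
        return cached  # trusted without looking at the index
    prefix = directory + "/" if directory else ""
    entries = {}
    for path, blob in index.items():
        if not path.startswith(prefix):
            continue
        name, sep, _ = path[len(prefix):].partition("/")
        # None marks a subdirectory whose name is resolved below.
        entries[name] = None if sep else blob
    h = hashlib.sha1()
    for name in sorted(entries):
        oid = entries[name]
        if oid is None:
            oid = write_tree(index, cache_tree, prefix + name)
        h.update(f"{name}\0{oid}\n".encode())
    cache_tree[directory] = h.hexdigest()
    return cache_tree[directory]

index = {"README": "b1", "lib/main.c": "b2"}

# Not primed at all: slower, but the result is right.
correct = write_tree(index, {})

# Primed with a name that does not match what the index holds for lib/.
wrong = write_tree(index, {"": None, "lib": "stale-tree-name"})
```

Here `wrong` differs from `correct` even though the index is the same
in both calls, which is the "wrong tree" case described above.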

HTH.