Victoria Dye <vdye@xxxxxxxxxx> writes:

> Phillip Wood wrote:
>> I was looking at the callers to prime_cache_tree() this morning
>> and would like to suggest an alternative approach - just delete
>> prime_cache_tree() and all of its callers!

Do you mean the calls added by new patches without understanding
what they are doing, or all calls to it?

Every time you update a path in the index from the working tree
(e.g. "git add") or from other sources, the directory in the
cache-tree that includes the path is invalidated, and the surviving
subtrees of the cache-tree are used to speed up writing the index as
a tree object, running "diff-index --cached" (hence "git status"),
etc. So over time, the cache-tree "degrades" as you muck with the
index entries.

When you write out the index as a tree, we by definition have to
know the object names of all the tree objects that correspond to
each directory in the index. A fully valid cache-tree is saved when
that happens, so the above process can start over.

There are cases other than "git write-tree" in which we can cheaply
learn the object names of all the tree objects that correspond to
each directory in the index. When we read the index from an
existing tree object, we know which tree (and its subtrees) we
populated the index from, so we can salvage a degraded cache-tree.
"reset --hard" and "reset --mixed" may be good opportunities, as is
"checkout <branch>" that starts from a clean index. Cache-tree
priming is the mechanism that takes advantage of such an
opportunity.

The cache-tree does not have to be primed, and all you lose is
performance, so priming can be removed mostly "without an issue",
if you are not paying attention to cache-tree degradation. Priming
with incorrect data, however, would leave permanent damage by
writing a wrong tree via "git write-tree" (hence "git commit") and
showing a wrong diff via "git diff-index [--cached]" (hence "git
status" and probably "git add -- <pathspec>"), so not priming is
safer than priming incorrectly.
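
To make the degrade-then-prime cycle concrete, here is a toy model
in Python. It is only a sketch and not Git's actual code: the class
ToyCacheTree, its methods, and fake_id() are hypothetical names made
up for illustration. The model assumes that updating a path
invalidates the directory containing it and every parent directory
up to the root, on the reasoning that a parent's tree object name
depends on its children.

```python
# Toy model only: the names here are hypothetical, not Git's cache-tree API.
import hashlib


class ToyCacheTree:
    """Maps a directory ("" is the root) to a cached tree id, or None if invalid."""

    def __init__(self):
        self.tree_ids = {}

    def prime(self, known_trees):
        """Record an id for every directory, e.g. right after reading a tree."""
        self.tree_ids = dict(known_trees)

    def invalidate_path(self, path):
        """Invalidate the directory containing `path` and all of its parents."""
        directory = path.rpartition("/")[0]
        while True:
            if directory in self.tree_ids:
                self.tree_ids[directory] = None
            if not directory:  # the root has been handled; stop
                break
            directory = directory.rpartition("/")[0]

    def valid_dirs(self):
        """Directories whose cached id can still be reused."""
        return sorted(d for d, tid in self.tree_ids.items() if tid is not None)


def fake_id(name):
    """Stand-in for a tree object name; not how Git computes one."""
    return hashlib.sha1(name.encode()).hexdigest()


trees = {d: fake_id(d) for d in ("", "src", "src/lib", "docs")}
cache = ToyCacheTree()
cache.prime(trees)                       # fully valid, as after reading a tree
cache.invalidate_path("src/lib/util.c")  # as after updating one index entry
print(cache.valid_dirs())                # only "docs" is still reusable
cache.prime(trees)                       # re-prime from a tree we know matches
print(cache.valid_dirs())                # every directory is valid again
```

Note that prime() trusts whatever it is handed. If the ids passed in
did not match the index contents, everything that later reuses them
would be wrong, which is the reason not priming is the safer
failure mode.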
HTH.