Phillip Wood wrote: > Hi Victoria > > On 07/10/2021 22:15, Victoria Dye via GitGitGadget wrote: >> From: Victoria Dye <vdye@xxxxxxxxxx> >> >> Remove `ensure_full_index` guard on `prime_cache_tree` and update >> `prime_cache_tree_rec` to correctly reconstruct sparse directory entries in >> the cache tree. While processing a tree's entries, `prime_cache_tree_rec` >> must determine whether a directory entry is sparse or not by searching for >> it in the index (*without* expanding the index). If a matching sparse >> directory index entry is found, no subtrees are added to the cache tree >> entry and the entry count is set to 1 (representing the sparse directory >> itself). Otherwise, the tree is assumed to not be sparse and its subtrees >> are recursively added to the cache tree. > > I was looking at the callers to prime_cache_tree() this morning and would like to suggest an alternative approach - just delete prime_cache_tree() and all of its callers! As far as I can see it is only ever called after a successful call to unpack_trees() and since 52fca2184d ("unpack-trees: populate cache-tree on successful merge", 2015-07-28) unpack_trees() updates the cache tree for the caller. All the call sites are pretty obvious apart from the one in t/help/test-fast-rebase.c where unpack_trees() is called by merge_switch_to_result() > It looks like `prime_cache_tree` can be removed mostly without issue, but it causes the two last tests in `t4058-diff-duplicates.sh` to fail. Those tests document failure cases when dealing with duplicate tree entries [1], and it looks like `prime_cache_tree` was creating the appearance of a fully-reset index but was still leaving it in a state where subsequent operations could fail. I'm inclined to say the solution here would be to update the tests to document the "new" failure behavior and proceed with removing `prime_cache_tree`, because: * the test using `git reset --hard` disables `GIT_TEST_CHECK_CACHE_TREE`, indicating that `prime_cache_tree` already wasn't behaving correctly * attempting to fix the overarching issues with duplicate tree entries will substantially delay this patch series * a duplicate entry fix is largely unrelated to the intended scope of the series Another option would be to leave `prime_cache_tree` as it is, but with it being apparently useless outside of mostly-broken use cases in `t4058`, it seems like a waste to keep it around. [1] ac14de13b2 (t4058: explore duplicate tree entry handling in a bit more detail, 2020-12-11)