Duy Nguyen <pclouds@xxxxxxxxx> writes: > Right. I missed this but I think this is a less important test > because I added a new test to make sure "diff --cached" ("git > status" to be exact) outputs the right thing when i-t-a entries > are present. OK. >> If on the other hand the tests show the same result from these two >> "diff --cached" and the result is different from what the test >> expects, that means your patch changed the world order, i.e. an >> i-t-a entry used to be treated as if it were adding an empty blob to >> the index but it is now treated as non-existent, then that is a good >> thing and the only thing we need to update is what the test expects. >> I am guessing that instead of expecting dir/bar to be shown, it now >> should expect no output? > > Yes, no output. Good, as that means your changes did not break things, right? > But still, if I revert the change in cache-tree.c and force write-tree > to produce valid cache-tree when i-t-a entries are present, this test > still passes incorrectly. This is worrysome. Doesn't that suggest that the diff-cache optimization is disabled and cache-tree that is marked as valid (because you "revert" the fix) is not looked at? That is the only explanation that makes sense to me---you write out a tree omitting i-t-a entries in the index and update the index without your earlier fix eec3e7e4 (cache-tree: invalidate i-t-a paths after generating trees, 2012-12-16), i.e. marking the cache-tree entries valid. A later invocation of diff-cache *should* trust that validity and just say "the tree and the index match" without even digging into the individual entries, at which point you should see a bogus result. If that is the case, it would be a performance regression, which may be already there before your patches or something your patches may have introduced. Ah, wait. I suspect that it all cancels out. * You "add -N dir/bar". You write-tree. The tree is written as if dir/bar is not there. cache-tree is updated and it is marked as valid (because you deliberately broke the earlier fix you made at eec3e7e4). * You run "diff-cache". It walks the HEAD and the cache-tree and finds that HEAD:dir and the cache-tree entry for "dir" records the same tree object name, and the cache-tree entry is marked "valid". Hence you declare "no differences". But for the updated definition of "diff-cache", there indeed is no difference. The HEAD and the index matches, because dir/bar does not yet exist. OK, so I think I can see how the result could work without invalidating the cache-tree entry that contains i-t-a entries. It might be even the right thing to do in general. We can view that invalidation a workaround to achieve the old behaviour of diff-cache, which used to report that dir/ are different between HEAD and index. We can even argue that it is the right thing to mark the cache-tree entry for dir/ as valid, no matter how many i-t-a entries are in there, as long as writing out the tree for dir/ out of index results in the same tree as the tree that is stored as HEAD:dir/. And from that viewpoint, keeping the earlier fix (invalidation code) when these patches are applied _is_ introducing a performance bug. Now, as you mentioned, there _may_ be codepaths that uses the same definition of "what's in the index" as what diff-cache used to take before your patches, and they may be broken by removing the invalidation. If we find such codepaths, we should treat their old behaviour as buggy and fix them, instead of reintroducing the invalidation and keep their current behaviour, as the new world order is "i-t-a entries in the index does not yet exist." Thanks for straightening my confusion out. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html