Duy Nguyen <pclouds@xxxxxxxxx> writes:

> On Mon, Mar 16, 2015 at 09:05:45AM -0700, Junio C Hamano wrote:
>> The offending one came from eec3e7e4 (cache-tree: invalidate i-t-a
>> paths after generating trees, 2012-12-16), which was a fix to an
>> earlier bug where a cache-tree written out of an index with i-t-a
>> entries had incorrect information and still claimed it is fully
>> valid after write-tree rebuilt it.  The test probably should add
>> another path without i-t-a bit, run the same "diff --cached" with
>> updated expectation before write-tree, and run the "diff --cached"
>> again to make sure it produces a result that matches the updated
>> expectation.
>
> Would adding another non-i-t-a entry help? Before this patch
> "diff --cached" after write-tree shows the i-t-a entry only when
> eec3e7e4 is applied. But with this patch we don't show i-t-a entry any
> more, before or after write-tree, eec3e7e4 makes no visible difference.
>
> We could even revert eec3e7e4 and the outcome of "diff --cached" would
> be the same because we just sort of move the "invalidation" part from
> cache-tree to do_oneway_diff(). Not invalidating would speed up "diff
> --cached" when i-t-a entries are present. Still it may be a good idea
> to invalidate i-t-a paths to be on the safe side. Perhaps a patch like
> this to resurrect the test?

My understanding of what eec3e7e4 (cache-tree: invalidate i-t-a paths
after generating trees, 2012-12-16) fixed was that in this sequence:

 - You prepare an index.

 - You write-tree out of the index, which involves:

   - updating the cache-tree to match the shape of the tree resulting
     from writing the index out.

   - creating tree objects matching all levels of the cache-tree as
     needed on disk.

   - reporting the top-level tree object name.

 - You run "diff-index --cached", which can and will take advantage of
   the fact that everything in a subtree below a known-to-be-valid
   cache-tree entry does not have to be checked one-by-one.

If a cache-tree says "everything under D/ in the index would hash to
tree object T" and the HEAD has tree object T at D/, then the diff
machinery will bypass the entire section in the index under D/, which
is a valid optimization.

However, when there is an i-t-a entry, we excluded that entry from the
tree object computation, so its presence did not contribute to the
tree object name, but we still marked the cache-tree entries that
contain it as valid by mistake.  This old bug was what the commit
fixed, so an invocation of "diff --cached" after a write-tree, even if
the index contains an i-t-a entry, will not see cache-tree entries
that are marked valid when they are not.  Instead, "diff --cached"
will bypass the optimization and compare the index entries one by one.

So reverting the fix obviously is not the right thing to do.  If the
tests show different results from two invocations of "diff --cached"
with your patch applied, there is something that is broken by your
patch, because the index and the HEAD do not change across write-tree
in that test.

If on the other hand the tests show the same result from these two
"diff --cached" and the result is different from what the test
expects, that means your patch changed the world order, i.e. an i-t-a
entry used to be treated as if it were adding an empty blob to the
index but it is now treated as non-existent, then that is a good thing
and the only thing we need to update is what the test expects.  I am
guessing that instead of expecting dir/bar to be shown, it now should
expect no output?

Does adding a non-i-t-a entry help?  It does not hurt, and it makes
the test use a non-empty output, making its effect more visible, which
may or may not count as helping.
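To make the argument above concrete, here is a toy Python model of the
sequence (prepare an index, write-tree, then "diff --cached").  It is
only a sketch under loud assumptions: Entry, write_tree(), diff_cached()
and the flags invalidate_ita and ita_as_empty_blob are hypothetical
names made up for illustration, directories are modelled one level deep
only, and none of this mirrors Git's real data structures or code.

```python
import hashlib
from dataclasses import dataclass

EMPTY_BLOB = "empty"  # stand-in for the empty blob's object name


@dataclass(frozen=True)
class Entry:  # hypothetical index entry
    path: str
    blob: str
    ita: bool = False  # intent-to-add, as set by "git add -N"


def dirname(path):
    return path.rsplit("/", 1)[0] if "/" in path else ""


def tree_name(pairs):
    """Toy tree hash over (path, blob) pairs; not Git's tree format."""
    h = hashlib.sha1()
    for path, blob in sorted(pairs):
        h.update(f"{path}\0{blob}\n".encode())
    return h.hexdigest()


def write_tree(index, invalidate_ita):
    """Return a toy cache-tree: directory -> tree name, or None if invalid.

    i-t-a entries never contribute to the tree name.  With
    invalidate_ita=False the directory is still marked valid (the old
    bug); with True it is invalidated (what the fix is described to do).
    """
    cache = {}
    for d in {dirname(e.path) for e in index}:
        under = [e for e in index if dirname(e.path) == d]
        name = tree_name((e.path, e.blob) for e in under if not e.ita)
        has_ita = any(e.ita for e in under)
        cache[d] = None if (invalidate_ita and has_ita) else name
    return cache


def diff_cached(index, head, cache, ita_as_empty_blob=True):
    """List paths that differ between the toy index and the toy HEAD."""
    changed = []
    dirs = {dirname(p) for p in head} | {dirname(e.path) for e in index}
    for d in sorted(dirs):
        head_here = {p: b for p, b in head.items() if dirname(p) == d}
        # Shortcut: a valid cache-tree entry that matches HEAD's tree
        # lets us skip every index entry under this directory.
        if cache.get(d) is not None and cache[d] == tree_name(head_here.items()):
            continue
        # Otherwise compare one by one.
        staged = {}
        for e in index:
            if dirname(e.path) != d:
                continue
            if not e.ita:
                staged[e.path] = e.blob
            elif ita_as_empty_blob:  # old behaviour: i-t-a looks like an empty blob
                staged[e.path] = EMPTY_BLOB
        for p in sorted(set(staged) | set(head_here)):
            if staged.get(p) != head_here.get(p):
                changed.append(p)
    return changed


head = {"dir/foo": "b1"}
index = [Entry("dir/foo", "b1"), Entry("dir/bar", "", ita=True)]

print(diff_cached(index, head, {}))                        # ['dir/bar']
print(diff_cached(index, head, write_tree(index, False)))  # [] (bug: result changed)
print(diff_cached(index, head, write_tree(index, True)))   # ['dir/bar']
print(diff_cached(index, head, write_tree(index, True),
                  ita_as_empty_blob=False))                # []
```

In this model the second call is the symptom described above: nothing
in the index or HEAD changed, yet the output differs across write-tree
because a stale "valid" cache-tree entry triggered the shortcut.  The
last call corresponds to the guess that, once i-t-a entries are treated
as non-existent, the test should expect no output for dir/bar.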