Re: [PATCH v2] diff-lib.c: adjust position of i-t-a entries in diff

Junio C Hamano <gitster@xxxxxxxxx> · Tue, 17 Mar 2015 10:57:06 -0700

Duy Nguyen <pclouds@xxxxxxxxx> writes:

> On Mon, Mar 16, 2015 at 09:05:45AM -0700, Junio C Hamano wrote:
>> The offending one came from eec3e7e4 (cache-tree: invalidate i-t-a
>> paths after generating trees, 2012-12-16), which was a fix to an
>> earlier bug where a cache-tree written out of an index with i-t-a
>> entries had incorrect information and still claimed it is fully
>> valid after write-tree rebuilt it.  The test probably should add
>> another path without i-t-a bit, run the same "diff --cached" with
>> updated expectation before write-tree, and run the "diff --cached"
>> again to make sure it produces a result that matches the updated
>> expectation.
>
> Would adding another non-i-t-a entry help? Before this patch
> "diff --cached" after write-tree shows the i-t-a entry only when
> eec3e7e4 is applied. But with this patch we don't show i-t-a entry any
> more, before or after write-tree, eec3e7e4 makes no visible difference.
>
> We could even revert eec3e7e4 and the outcome of "diff --cached" would
> be the same because we just sort of move the "invalidation" part from
> cache-tree to do_oneway_diff(). Not invalidating would speed up "diff
> --cached" when i-t-a entries are present. Still it may be a good idea
> to invalidate i-t-a paths to be on the safe side. Perhaps a patch like
> this to resurrect the test?

My understanding of what eec3e7e4 (cache-tree: invalidate i-t-a paths
after generating trees, 2012-12-16) fixed was that in this sequence:

    - You prepare an index.

    - You write-tree out of the index, which involves:

      - updating the cache-tree to match the shape of the tree
        resulting from writing the index out.

      - creating tree objects matching all levels of the cache-tree
        as needed on disk.

      - reporting the top-level tree object name.

    - You run "diff-index --cached", which can and will take advantage
      of the fact that everything in a subtree below a
      known-to-be-valid cache-tree entry does not have to be checked
      one-by-one.  If a cache-tree says "everything under D/ in the
      index would hash to tree object T" and the HEAD has tree object
      T at D/, then the diff machinery will bypass the entire section
      in the index under D/, which is a valid optimization.

      However, when there was an i-t-a entry, we excluded that entry
      from the tree object computation, so its presence did not
      contribute to the tree object name, but we still marked the
      cache-tree entries that contain it as valid by mistake.  This
      old bug was what the commit fixed, so an invocation of "diff
      --cached" after a write-tree, even if the index contains an
      i-t-a entry, will not see cache-tree entries that are marked
      valid when they are not.  Instead, "diff --cached" will bypass
      the optimization and compare the index entries one-by-one.
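
To make the sequence above concrete, here is a toy model in Python.
It is only a sketch under simplifying assumptions and not how git's C
code is written: there is a single cache-tree entry for the whole
index, blobs are plain strings, and the names IndexEntry, CacheTree,
write_tree(), diff_cached() and the invalidate_ita switch are all made
up for illustration.  It uses the old semantics, in which an i-t-a
entry shows up as an added path.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class IndexEntry:
    path: str
    blob: str
    intent_to_add: bool = False


@dataclass
class CacheTree:
    tree_id: str = ""
    valid: bool = False


def hash_tree(entries):
    """Toy tree hash; i-t-a entries are left out of the tree object."""
    h = hashlib.sha1()
    for e in sorted(entries, key=lambda e: e.path):
        if not e.intent_to_add:
            h.update(f"{e.path}\0{e.blob}\n".encode())
    return h.hexdigest()


def write_tree(index, cache_tree, invalidate_ita=True):
    """Record the tree name; invalidate_ita=False models the old bug."""
    cache_tree.tree_id = hash_tree(index)
    cache_tree.valid = True
    if invalidate_ita and any(e.intent_to_add for e in index):
        cache_tree.valid = False  # the tree does not cover the i-t-a path
    return cache_tree.tree_id


def diff_cached(head, index, cache_tree):
    """head maps path -> blob.  Returns the sorted list of differing paths."""
    head_tree_id = hash_tree([IndexEntry(p, b) for p, b in head.items()])
    if cache_tree.valid and cache_tree.tree_id == head_tree_id:
        return []  # shortcut: trust the cache-tree, skip the entries
    index_paths = {e.path for e in index}
    changed = {e.path for e in index if head.get(e.path) != e.blob}
    changed |= {p for p in head if p not in index_paths}
    return sorted(changed)


head = {"dir/foo": "foo-v1"}
index = [
    IndexEntry("dir/foo", "foo-v1"),
    IndexEntry("dir/bar", "", intent_to_add=True),
]

# No cache-tree yet, so entries are compared one-by-one.
before = diff_cached(head, index, CacheTree())

# Old bug: the cache-tree stays valid, so the shortcut hides dir/bar.
buggy = CacheTree()
write_tree(index, buggy, invalidate_ita=False)
buggy_after = diff_cached(head, index, buggy)

# With the invalidation, the result is the same as before write-tree.
fixed = CacheTree()
write_tree(index, fixed)
fixed_after = diff_cached(head, index, fixed)
```

In this model "before" and "fixed_after" are both ["dir/bar"], while
"buggy_after" is empty, which is the kind of discrepancy across
write-tree that the test is meant to catch.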

So reverting the fix obviously is not the right thing to do.  If the
tests show different results from two invocations of "diff --cached"
with your patch applied, there is something that is broken by your
patch, because the index and the HEAD do not change across
write-tree in that test.

If on the other hand the tests show the same result from these two
"diff --cached" and the result is different from what the test
expects, that means your patch changed the world order, i.e. an
i-t-a entry used to be treated as if it were adding an empty blob to
the index but it is now treated as non-existent, then that is a good
thing and the only thing we need to update is what the test expects.
I am guessing that instead of expecting dir/bar to be shown, it now
should expect no output?

Does adding a non-i-t-a entry help?  It does not hurt, and it makes
the test use a non-empty output, making its effect more visible,
which may or may not count as helping.
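
As a rough sketch of how the expectation would change, here is a toy
model in Python of the one-by-one comparison under the two behaviours.
It is not git's code: diff_cached_names() and its ita_invisible
parameter are names made up for this illustration, blobs are plain
strings, and dir/baz stands in for the extra non-i-t-a path.

```python
def diff_cached_names(head, index, ita_invisible):
    """head: {path: blob}; index: {path: (blob, is_ita)}.

    Returns the sorted paths that differ between HEAD and the index.
    """
    changed = set()
    for path, (blob, is_ita) in index.items():
        if is_ita and ita_invisible:
            continue  # new behaviour: the i-t-a entry is non-existent
        if head.get(path) != blob:
            changed.add(path)  # added or modified in the index
    for path in head:
        entry = index.get(path)
        if entry is None or (entry[1] and ita_invisible):
            changed.add(path)  # present in HEAD, missing from the index
    return sorted(changed)


head = {"dir/foo": "foo-v1"}
index = {
    "dir/foo": ("foo-v1", False),
    "dir/bar": ("", True),  # i-t-a entry, as if adding an empty blob
}

old = diff_cached_names(head, index, ita_invisible=False)
new = diff_cached_names(head, index, ita_invisible=True)

# Add one more path without the i-t-a bit.
index["dir/baz"] = ("baz-v1", False)
old_with_baz = diff_cached_names(head, index, ita_invisible=False)
new_with_baz = diff_cached_names(head, index, ita_invisible=True)
```

Here "old" is ["dir/bar"] and "new" is empty; with the extra path,
"new_with_baz" is ["dir/baz"], so the updated expectation is no longer
an empty output.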
