Re: BUG: git log: fatal: internal error in diff-resolve-rename-copy

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Constantine Plotnikov <constantine.plotnikov@xxxxxxxxx> writes:

> Somewhere between the git 1.7.0.2 and the git 1.7.2.0 the rename
> detection started to fail with fatal error on some files in our
> repository. The bug could be seen on the public IntelliJ IDEA
> repository (about 760M in size), but our users have reported it as
> well.
>
> To reproduce the error, run the following sequence of the commands:
>
> git clone git://git.jetbrains.org/idea/community.git idea
> cd idea
> git log -M --follow --name-only --
> platform/lang-api/src/com/intellij/lang/documentation/CompositeDocumentationProvider.java
>
> As result "fatal: internal error in diff-resolve-rename-copy" is
> written on stderr. This is somewhat unexpected result. Git 1.7.0.2 and
> 1.6.5.2 seems to work without visible problems.

This is interesting.  I actually see another potential "funny" (and that
is why Linus is CC'ed --- scroll down to "funny"), which is unrelated.

diff_tree_sha1() essentially:

 - given two trees, runs straightforward diff-tree without any rename
   detection;

 - under --follow mode, if it has only one change that deletes a path,
   runs try_to_follow_renames(), which:

   - runs the same diff-tree with deep rename detection but without any
     path limiter;

   - if we find a rename or copy that explains why we saw the deletion in
     the first step, replace the deletion record with the rename or copy.
     also set up to follow the old name from now on.

 - then return to the caller.

The diff frontends (diff-tree, diff-files and diff-index) are expected to
leave vanilla filepairs and let the diffcore backend to find renames and
copies via a call to diff_resolve_rename_copy() in diffcore_std().

We however have a special "hack" in "diff-tree --follow".  If it finds
only one diff_queue entry that creates a path, it internally runs itself
again with the rename/copy detection on, without any pathspec, to see if
the creation is an artifact of the pathspec (iow, if there is a source of
rename/copy into the created path), and replaces that creation record with
a rename record in the diff_queue.  When this is done, we do not want the
regular resolve_rename_copy() to kick in (i.e. the "follow" hack already
made a pair).

But what 1da6175 (Make diffcore_std only can run once before a diff_flush,
2010-05-06) did is clearly wrong.  Not wanting to call resolve-rename-copy
does not mean we do not want to run the rest of what diffcore_std() does
at all!  For example, "-S" and "--diff-filter=" options are processed in
that function; the exit status of the command based on the presense of
difference is computed in the function, too.

Another potential "funny" is this (unrelated to the reported issue).

The "--follow" logic is called from diff_tree_sha1() function, but the
input trees to diff_tree_sha1() are not necessarily the top-level trees
(compare_tree_entry() calls it while it recursively descends into
subtrees).  For example, with the example Constantine gave us, the first
"try-to-follow-renames" call happens with the "base" set to "platform/"
but the rename source is actually "lang-api/src/com/intellij/..."
hierarchy, so it is a wasted call.  I think we only want to run the rename
following at the very top level, i.e. like the attached patch.

Linus, what do you think?  Am I missing something obvious?

diff --git a/tree-diff.c b/tree-diff.c
index 1fb3e94..5b68c08 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -412,7 +412,7 @@ int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const cha
 	init_tree_desc(&t1, tree1, size1);
 	init_tree_desc(&t2, tree2, size2);
 	retval = diff_tree(&t1, &t2, base, opt);
-	if (DIFF_OPT_TST(opt, FOLLOW_RENAMES) && diff_might_be_rename()) {
+	if (!*base && DIFF_OPT_TST(opt, FOLLOW_RENAMES) && diff_might_be_rename()) {
 		init_tree_desc(&t1, tree1, size1);
 		init_tree_desc(&t2, tree2, size2);
 		try_to_follow_renames(&t1, &t2, base, opt);

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]