On Nov 29, 2007, at 11:44 AM, Linus Torvalds wrote:
On Thu, 29 Nov 2007, Kumar Gala wrote:
I did some git-mv and got the following:
the problem is git seems confused about what file was associated with
its source.
Well, I wouldn't say "confused". It found multiple identical options for
the source, and picked the first one (where "first one" may not be
obvious to a human, it can depend on an internal hash order).
But if you have the resultant git tree somewhere public (or just send me
the exact "git mv" and revision to recreate), I'll happily give it a
look, to see if we can improve our heuristics to be closer to what a
human would expect.
For example, in this case, it looks like there were two totally
identical "init.S" files that got renamed with the same identical
content to two new names. YOU seem to expect that it would stay as two
renames, but from a content angle, since the two sources were identical,
it's a totally arbitrary choice whether it's a "copy one source to two
destinations and delete the other source" or whether it's two cases of
"move one source to another destination" (and the latter case also has
the issue of which way to move it).
(You also had two identical Makefiles with the exact same issue).
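To make the ambiguity concrete, here is a minimal sketch of matching purely
on content. It is an illustration, not git's actual code: the paths and file
contents are hypothetical, `group_by_content` and `candidate_sources` are
made-up helper names, and a plain SHA-1 of the bytes stands in for git's blob
hash (which also covers a header).

```python
import hashlib
from collections import defaultdict


def group_by_content(files):
    """Map content hash -> sorted list of paths holding exactly that content."""
    groups = defaultdict(list)
    for path, data in files.items():
        groups[hashlib.sha1(data).hexdigest()].append(path)
    return {digest: sorted(paths) for digest, paths in groups.items()}


def candidate_sources(deleted, added):
    """For each added path, list every deleted path with identical content."""
    sources_by_hash = group_by_content(deleted)
    result = {}
    for digest, destinations in group_by_content(added).items():
        for dst in destinations:
            result[dst] = sources_by_hash.get(digest, [])
    return result


# Hypothetical paths and contents, chosen only to mirror the situation above.
deleted = {"boardA/init.S": b"same bytes\n", "boardB/init.S": b"same bytes\n"}
added = {"cpu/x/init.S": b"same bytes\n", "cpu/y/init.S": b"same bytes\n"}

candidates = candidate_sources(deleted, added)
for dst, sources in sorted(candidates.items()):
    print(dst, "<-", sources)
```

Both destinations end up with the same two candidate sources, so content
alone cannot say which pairing the user had in mind.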
So git doesn't care about how you did the rename, it only cares about
the end result, and the exact same way that it will detect a rename if
you implement it as a "copy file" and then a later "delete old file", it
will also potentially go the other way, or just decide that identical
contents moved in different ways.
I was guessing most of this but wanted to make sure there wasn't some
cool feature of git I wasn't aware of.
But we can certainly tweak the heuristics. For example, if we find
multiple identical renames, right now we just pick one fairly at random,
and have no logic to prefer independent renames over "multiple copies
and a delete". But this code is actually fairly simple, and with a good
example I can easily add heuristics (for example, it probably *is*
better to consider it to be two renames, just because the resulting diff
will be smaller - since a "delete" diff is much larger than a rename
diff).
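One way to picture the "prefer independent renames" idea is the sketch
below. It is only an assumption about what such a heuristic could look like,
not the change that went into git: `pair_exact_matches` is a hypothetical
name, and it expects that every path passed in has identical content.

```python
def pair_exact_matches(sources, destinations):
    """Pair identical-content files, using each source at most once if possible.

    Returns (operations, leftover_sources). Each operation is a tuple
    (kind, source, destination) with kind "rename" or "copy". Leftover
    sources are the ones that would still show up as plain deletes.
    """
    if destinations and not sources:
        raise ValueError("need at least one source to pair with")
    unused = list(sources)
    operations = []
    for dst in destinations:
        if unused:
            # A fresh source is available: treat this as an independent rename.
            operations.append(("rename", unused.pop(0), dst))
        else:
            # Every source is already consumed: fall back to a copy.
            operations.append(("copy", sources[0], dst))
    return operations, unused


ops, leftover = pair_exact_matches(
    ["boardA/init.S", "boardB/init.S"],  # hypothetical deleted paths
    ["cpu/x/init.S", "cpu/y/init.S"],    # hypothetical added paths
)
```

With two sources and two destinations this yields two renames and no
leftover delete, which is the smaller diff. With a single source and two
destinations it degrades to one rename plus one copy.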
In the case of multiple identical matches can we look at the file name
as a possible heuristic?
- k
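The file-name idea raised above could be sketched roughly as follows. This
is a guess at one possible tie-breaker, not anything git is known to do
here: `name_score` and `pick_source` are hypothetical helpers, the weights
are arbitrary, and the paths are invented.

```python
import posixpath


def name_score(src, dst):
    """Score how alike two paths look; a matching basename counts the most."""
    score = 0
    if posixpath.basename(src) == posixpath.basename(dst):
        score += 10
    # Each directory name that appears in both paths adds one point.
    src_dirs = set(posixpath.dirname(src).split("/"))
    dst_dirs = set(posixpath.dirname(dst).split("/"))
    score += len((src_dirs & dst_dirs) - {""})
    return score


def pick_source(candidates, dst):
    """Among identical-content candidates, pick the one named most like dst."""
    # Sorting first makes ties resolve the same way on every run.
    return max(sorted(candidates), key=lambda src: name_score(src, dst))


best = pick_source(["board/a/init.S", "board/b/init.S"], "cpu/b/init.S")
```

Here both candidates share the basename, and the shared directory name "b"
breaks the tie in favour of "board/b/init.S". When the names carry no signal
at all, the choice falls back to sorted order, so it is at least stable.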