Re: git-mv redux: there must be something else going on

On Wed, Feb 3, 2010 at 2:23 PM, Ron Garret <ron1@xxxxxxxxxxx> wrote:
> Ah.  That explains everything.  Thanks.  (I thought git mv was
> equivalent to git rm followed by git add.  But it's not.)

I suppose in this case it's not.  The two only differ when your work
tree differs from your index, though: 'git rm' drops the entry from
the index, so it's to be expected that you lose the ability to track
those differences.

> So... how *does* git decide when two blobs are different blobs and when
> they are the same blob with mods?  I asked this question before and was
> pointed to the diffcore docs, but that didn't really clear things up.
> That just describes all the different ways git can do diffs, not the
> actual heuristics that git uses to track content.

If you really want to know the details, reading the code is probably
the best way to find out; it's not even that long.

The short version is that git chooses a set of candidate blobs, then
diffs them and figures out a percentage similarity between each pair.
(A simple way to think of the similarity index is "how long is the
diff compared to the file itself?"  If the diff is of length zero, the
similarity is 100%, and so on.)  If the similarity is greater than a
certain threshold (50% by default), then it's considered to be the
same file.
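
To make the "how long is the diff compared to the file" idea concrete,
here is a rough sketch in Python.  It is not git's code: it uses
difflib's line matcher as a stand-in for git's own similarity
estimate, and the names similarity() and same_file() are made up for
illustration.  The 50% cutoff is an assumption taken from the default
of -M.

```python
from difflib import SequenceMatcher

# Assumed cutoff for this sketch, mirroring the default of -M.
THRESHOLD = 50.0


def similarity(old: bytes, new: bytes) -> float:
    """Return a rough similarity percentage (0-100) between two blobs.

    Stand-in for git's estimate: the share of lines the two blobs
    have in common, as computed by difflib.
    """
    old_lines = old.splitlines()
    new_lines = new.splitlines()
    if not old_lines and not new_lines:
        return 100.0  # two empty blobs are identical
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    return 100.0 * matcher.ratio()


def same_file(old: bytes, new: bytes, threshold: float = THRESHOLD) -> bool:
    """Treat two blobs as the same file when they are similar enough."""
    return similarity(old, new) > threshold
```

With this, two identical blobs score 100, and a four-line file with
one line changed scores 75, so same_file() still pairs them up.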

Choosing the set of candidates is actually the more interesting
problem, since detecting moves using the above algorithm is O(n^2)
in the number of candidates.  That's why 'git diff' and 'git log'
don't do it at all by default.

If you provide -M, the candidates are the files that were removed and
the files that were added.  (Added files are compared against removed
files, iirc.)  Normally that's a very short list.  With -C, files that
were modified in the same change also count as possible sources, which
is slightly more work.  With --find-copies-harder, unmodified files
are considered as sources too, so it becomes potentially a *lot* of
work.
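
The candidate matching can be sketched the same way.  Again this is
an illustration, not git's implementation: detect_renames() and its
arguments are hypothetical, difflib stands in for git's similarity
estimate, and the greedy best-score-first pairing is my assumption.
The point is the nested loop: every source is scored against every
added file, so the cost grows with len(sources) * len(added).

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def similarity(old: bytes, new: bytes) -> float:
    """Rough similarity percentage (0-100); a stand-in for git's estimate."""
    a, b = old.splitlines(), new.splitlines()
    if not a and not b:
        return 100.0
    return 100.0 * SequenceMatcher(None, a, b, autojunk=False).ratio()


def detect_renames(
    sources: Dict[str, bytes],
    added: Dict[str, bytes],
    threshold: float = 50.0,  # assumed cutoff, mirroring the default of -M
) -> List[Tuple[str, str, float]]:
    """Pair each added path with its most similar source path."""
    scored = []
    # The quadratic part: every source against every added file.
    for src_path, src_blob in sources.items():
        for dst_path, dst_blob in added.items():
            score = similarity(src_blob, dst_blob)
            if score > threshold:
                scored.append((score, src_path, dst_path))

    # Greedy pairing, best score first; each path is used at most once.
    scored.sort(reverse=True)
    used_src, used_dst = set(), set()
    pairs = []
    for score, src_path, dst_path in scored:
        if src_path in used_src or dst_path in used_dst:
            continue
        used_src.add(src_path)
        used_dst.add(dst_path)
        pairs.append((src_path, dst_path, score))
    return pairs


removed = {"old.txt": b"a\nb\nc\nd\n"}
new_files = {"new.txt": b"a\nb\nc\nx\n", "other.txt": b"zzz\n"}
renames = detect_renames(removed, new_files)
```

Here renames pairs old.txt with new.txt and leaves other.txt as a
plain addition.  A wider search, as with -C or --find-copies-harder,
only changes what gets passed in as sources, which is exactly why it
gets expensive.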

Have fun,

Avery