Re: [WIP PATCH] Manual rename correction

On Wed, Aug 01, 2012 at 08:10:12AM +0700, Nguyen Thai Ngoc Duy wrote:

> > I do not think that is the right direction. Let's imagine that I have a
> > commit "A" and I annotate it (via notes or whatever) to say "between
> > A^^{tree} and A^{tree}, foo.c became bar.c". That will help me when
> > doing "git show" or "git log". But it will not help me when I later try
> > to merge "A" (or its descendant). In that case, I will compute the diff
> > between "A" and the merge-base (or worse, some descendant of "A" and the
> > merge-base), and I will miss this hint entirely.
> >
> > A much better hint is to annotate pairs of sha1s, to say "do not bother
> > doing inexact rename correlation on this pair; I promise that they have
> > value N".
> 
> I haven't had time to think it through yet, but I'll throw in my thoughts
> anyway. I actually went with your approach first. But it's more
> difficult to control the renaming. Assume we want to tell git to
> rename SHA-1 "A" to SHA-1 "B". What happens if we have two As in the
> source tree and two Bs in the target tree? What happens if two As and
> one B, or one A and two Bs? What if a user defines A -> B and A -> C,
> and we happen to have two As in source tree and B and C in target
> tree?

Yes, it disregards path totally. But if you had the exact same movement
of content from one path to another in one instance, and it is
considered a rename, wouldn't it also be a rename in a second instance?
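
A minimal sketch of what a path-blind hint lookup might look like, in
Python rather than git's C. Everything here (apply_pair_hints, the hints
dict, the first-unused-match policy) is hypothetical and only
illustrates the idea; it is not how git's diffcore works.

```python
def apply_pair_hints(deleted, added, hints):
    """Pair deleted and added entries using sha1-pair hints only.

    deleted, added: lists of (path, sha1) tuples.
    hints: dict mapping (src_sha1, dst_sha1) to a similarity score.
    Paths are never consulted; each added entry is used at most once.
    """
    renames = []
    unused = list(added)
    for src_path, src_sha in deleted:
        for i, (dst_path, dst_sha) in enumerate(unused):
            score = hints.get((src_sha, dst_sha))
            if score is not None:
                renames.append((src_path, dst_path, score))
                del unused[i]  # safe: we stop iterating right away
                break
    return renames
```

With two As in the source and two Bs in the target, this pairs them off
in list order; with two As and one B, the second A is simply left
unpaired. Other tie-breaking policies are possible.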

> There's also the problem with transferring this information. With
> git-notes I think I can transfer it (though not automatically). How do
> we transfer sha1 map (that you mentioned in the commit generation mail
> in this thread)?

That is orthogonal to the issue of what is being stored. I chose my
mmap'd disk implementation because it is very fast, which makes it nice
for a performance cache. But you could store the same thing in git-notes
(indexed by dst sha1, I guess, and then pointing to a blob of (src,
score) pairs).
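
As a rough illustration of that layout, here is a sketch in Python of a
store keyed by dst sha1 whose value is a blob listing (src, score)
pairs. The one-pair-per-line text encoding and the function names are
assumptions made up for this example; a real notes-based store could
pick any encoding.

```python
def encode_hints(pairs):
    """Serialize (src_sha1, score) pairs as one 'src score' line each."""
    return "".join(f"{src} {score}\n" for src, score in pairs).encode("ascii")


def decode_hints(blob):
    """Parse a blob produced by encode_hints back into a list of pairs."""
    pairs = []
    for line in blob.decode("ascii").splitlines():
        src, score = line.split()
        pairs.append((src, int(score)))
    return pairs


def lookup_score(notes, src_sha1, dst_sha1):
    """Return the recorded score for (src, dst), or None if absent.

    notes: dict mapping dst sha1 to an encoded blob, standing in for
    whatever storage (notes tree, mmap'd file) actually holds the data.
    """
    blob = notes.get(dst_sha1)
    if blob is None:
        return None
    for src, score in decode_hints(blob):
        if src == src_sha1:
            return score
    return None
```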

If you want to include path-based hints in a commit, I'd say that using
some micro-format in the commit message would be the simplest thing. But
that has been discussed before; ultimately the problem is that it only
covers _one_ diff that we do with that commit (it is probably the most
common, of course, but it doesn't cover them all).
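
For illustration only, such a micro-format could be as small as one
trailer-like line per rename. The "Rename-Hint:" key and the "=>"
separator below are invented for this sketch; no such format is defined
by git, and paths containing whitespace are not handled.

```python
import re

# Hypothetical line format: "Rename-Hint: old/path => new/path"
HINT_RE = re.compile(r"^Rename-Hint:\s*(\S+)\s*=>\s*(\S+)\s*$", re.MULTILINE)


def parse_rename_hints(message):
    """Return a list of (old_path, new_path) tuples found in a commit message."""
    return HINT_RE.findall(message)
```

Note that hints parsed this way are tied to the paths in that one
commit's diff against its parent, which is exactly the limitation
described above.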

> > Then it will find that pair no matter which trees or commits
> > are being diffed, and it will do so relatively inexpensively[1].
> 
> But does that happen often in practice? I mean diff-ing two arbitrary
> trees and expecting rename correction. I disregarded it as "git log" is
> my main case, but I'm just a single user.

It happens every time merge-recursive does rename detection, which
includes "git merge" but also things like "cherry-pick".

-Peff