Re: Rename handling

Steven Grimm <koreth@xxxxxxxxxxxxx> · Mon, 19 Mar 2007 11:14:51 -0700

John Goerzen wrote:
> 2) For me, a rename is a logical change to the source tree that I want
>    to be recorded with absolute certainty, not guessed about later.
>    Sometimes I may make API changes and it is useful to see how module
>    names changed, with complete precision, later.  I do not want to be
>    victim to an incorrect guess, which could be possible.

If you commit your renames separately from your content changes, it'll 
be unambiguous and you won't have to worry about it. That's what I 
usually do when this is a concern and it has yet to break for me.

On the other hand, I agree with your general point; I really don't like 
being uncertain about whether renames are going to come out correctly 
("it has always worked before" and "it is by design unable to fail" are 
two very different things). In particular, I strongly disagree with the 
"names are just syntactic sugar, it's the content we're tracking" 
philosophy. Here's a simple example of why:

#include <xyz.h>

That simple statement is an intermingling of content and namespace. The 
presence of something like that actually breaks the "commit the rename 
separately" approach -- if you rename xyz.h to something else and commit 
just that rename, that revision won't compile, and I *really* don't like 
intentionally committing broken revisions.

Okay, so you say, rename xyz.h and update the references to it, but 
don't actually modify it. Fine, that works in this case. Now how about 
this one:

public abstract class Foo {
   private static Logger logger = Logger.getLogger(Foo.class.getName());
}

The references to the name "Foo" in that case are within Foo.java 
itself (assuming you're using a Java compiler that requires the filename 
and class name to match, which the common ones do). You can't change 
just the references without changing the file you're renaming. And, 
depending on how much of the file consists of self-references, it's 
anyone's guess whether the content-based rename detection will consider 
the renamed file close enough to the old one to be a probable rename.

Combine renames with major code refactoring where the content changes 
substantially, and all bets are off.
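To make the self-reference worry concrete, here is a rough sketch in 
Python. It is NOT git's actual similarity heuristic; it uses difflib's 
line-based ratio as a stand-in, and the function names and the padded-out 
class body are made up for illustration. The only git fact assumed is 
that "git diff -M" defaults to a 50% similarity threshold.

```python
import difflib


def similarity(old_text: str, new_text: str) -> float:
    """Line-based similarity in [0.0, 1.0]. A stand-in metric, not git's."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    return matcher.ratio()


def looks_like_rename(old_text: str, new_text: str, threshold: float = 0.5) -> bool:
    """Treat the pair as a rename if similarity reaches the threshold.

    0.5 mirrors git's default for -M; the comparison itself is hypothetical.
    """
    return similarity(old_text, new_text) >= threshold


# The class from the example above, exactly as quoted: two of its three
# lines mention the class name.
SMALL_OLD = """public abstract class Foo {
   private static Logger logger = Logger.getLogger(Foo.class.getName());
}
"""
SMALL_NEW = SMALL_OLD.replace("Foo", "Bar")

# The same class padded with two hypothetical lines that do not mention
# the class name.
BIGGER_OLD = """public abstract class Foo {
   private static Logger logger = Logger.getLogger(Foo.class.getName());
   public abstract void run();
   public void stop() {}
}
"""
BIGGER_NEW = BIGGER_OLD.replace("Foo", "Bar")

if __name__ == "__main__":
    print("small :", similarity(SMALL_OLD, SMALL_NEW))
    print("bigger:", similarity(BIGGER_OLD, BIGGER_NEW))
```

With this toy metric the three-line class falls below the 50% line once 
Foo becomes Bar, while the padded five-line version stays above it. Real 
files are bigger and git's measure is different, but the shape of the 
problem is the same: the more of the file that is self-reference, the 
closer you get to the cutoff.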

Now, having said all that, I'll argue in favor of the content-based 
rename support for a moment. It is extremely cool that git will actually 
detect renames in third-party packages where you've just untarred a new 
release into your git repository and committed it, but have given git no 
hints at all about the nature of the content changes. I'm not aware of 
any other version control system that'll do that, and I've taken 
advantage of that feature in the past. So by no stretch am I saying that 
content-based rename detection is worthless.

But I would sure rest a lot easier if "git mv" would record a "the user 
renamed this file" entry in some log somewhere and the merge code would 
see that entry and say, "aha, no need to guess at it, file X got renamed 
to Y." Bonus points if that record could apply to directories too, so 
you don't have the "I created a new file in a directory you renamed, and 
after git-pull my file is still sitting by itself in the old directory" 
bug. If no such record exists, then the current rename code should still 
be invoked to work its considerable magic.
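Roughly what I have in mind, sketched in Python. To be clear, git has no 
such record today; the dictionary of recorded renames, the function name, 
and the "guess" callback standing in for the existing content-based 
detection are all hypothetical.

```python
from typing import Callable, Dict, Optional


def resolve_rename(
    path: str,
    recorded_renames: Dict[str, str],
    guess: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the new path for `path`, or None if it was not renamed.

    recorded_renames: hypothetical explicit "user renamed X to Y" entries,
                      for files or directories.
    guess:            stand-in for content-based detection, used only when
                      no explicit record applies.
    """
    # 1. An explicit record for this exact file wins outright.
    if path in recorded_renames:
        return recorded_renames[path]

    # 2. Otherwise look for a recorded rename of an enclosing directory,
    #    longest prefix first, and carry the file along with it.
    parts = path.split("/")
    for i in range(len(parts) - 1, 0, -1):
        prefix = "/".join(parts[:i])
        if prefix in recorded_renames:
            return recorded_renames[prefix] + "/" + "/".join(parts[i:])

    # 3. No record at all: fall back to guessing from content.
    return guess(path)


if __name__ == "__main__":
    recorded = {"xyz.h": "abc.h", "src/old": "src/new"}
    no_guess = lambda p: None
    print(resolve_rename("xyz.h", recorded, no_guess))
    print(resolve_rename("src/old/mine.c", recorded, no_guess))
    print(resolve_rename("README", recorded, no_guess))
```

The second lookup is the directory case: a file created under src/old by 
someone else follows the recorded directory rename instead of being left 
behind. The guessing code only runs in step 3, when nobody recorded 
anything.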

So to answer your question: in my opinion, if 100% guaranteed renames are 
high on your priority list, then Mercurial might be the better option 
for now. In practice, git's 99+% rename detection has yet to fail on me 
aside from the directory renaming case above, but at the end of the day 
it *is* guessing at your renames after the fact.

Okay, git gurus, show me no mercy. :)

-Steve
