Re: impure renames / history tracking

On Wed, 1 Mar 2006, Paul Jakma wrote:
> 
> FWIW, I think git's rename handling is really nice. It's just I suspect, being
> a heuristic, it won't be able to follow history reliably across 'very impure'
> renames.

The thing is, it does better than anything that _tries_ to be "reliable".

I can pretty much _guarantee_ that you can't do it better.

Tracking "inodes" - aka file identities - (which is what BK does, and I 
assume what SVN does) is fundamentally problematic. In particular, it's a 
horrible problem when two inodes "meet" under the same name. You now have 
two identities for the same file, and you're fundamentally screwed.

And don't tell me it doesn't happen. It _does_ happen, and it did happen 
with the kernel under BK.

It doesn't even need renames to be a problem. JUST THE FACT THAT YOU TRY 
TO TRACK FILE "IDENTITY" HISTORY IS BROKEN. For example, take CVS, which 
doesn't actually try to do renames, but _does_ try to track the identity 
of a file, since all the history is tied into that identity: think about 
what happens in Attic when a file is deleted. Completely broken model.

Now, CVS doesn't tend to show the problems very much, because people don't 
actually use branches that much (they are a pain in the neck), and they 
sure as hell try to avoid deleting and creating the same filename under a 
branch and on HEAD. I'm sure you can do it, but I'm also pretty sure 
there's a lot of old projects around that have ended up moving the ,v 
files around to play rename/delete games.

And that's really fundamental. CVS doesn't show the problems so much, 
because CVS actively tries to make it hard to do these things.

With renames-tracking-file-identities, it's _really_ easy to get some 
major confusion going. What happens when one branch creates a file, and 
another one renames a file to that same name, and they merge?

Don't tell me it doesn't happen. It happened under BK. The way BK "solved" 
it was to keep the two separate identities: one of them got resolved to 
the new filename, the other one went into the "deleted" directory. Guess 
what happens when the side that got merged into "deleted" continues to 
edit the file? That's right - their edits happen on the deleted file, and 
never show up in the real tree in a subsequent merge ever again.
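
The failure described above can be sketched with a toy model. This is only an illustration under stated assumptions: the identity names ("id-A", "id-B"), the "deleted/" path prefix, and the helper functions are hypothetical, and this is not how BK actually stores its data. It just shows why later edits to the parked identity never reach the visible tree.

```python
# Toy model: a tree maps a file identity to (path, content).
# All names here are hypothetical; this is not BK's real data model.

def merge_by_identity(ours, theirs):
    """Merge two trees keyed by file identity.

    A new identity whose path is already taken by a different identity
    gets parked under "deleted/". An identity we already know keeps the
    path we gave it and only picks up the new content.
    """
    merged = dict(ours)
    paths_in_use = {path for path, _ in merged.values()}
    for ident, (path, content) in theirs.items():
        if ident in merged:
            # Known identity: edits follow the identity, not the name.
            kept_path, _ = merged[ident]
            merged[ident] = (kept_path, content)
        elif path in paths_in_use:
            # Two identities "meet" under one name: park the newcomer.
            merged[ident] = ("deleted/" + path, content)
        else:
            merged[ident] = (path, content)
            paths_in_use.add(path)
    return merged


def visible_files(tree):
    """Return the files a user would actually see in the working tree."""
    return {
        path: content
        for path, content in tree.values()
        if not path.startswith("deleted/")
    }


# One branch creates foo.c, the other renames an existing file to foo.c.
main = {"id-A": ("foo.c", "created on A")}
side = {"id-B": ("foo.c", "renamed on B")}
first = merge_by_identity(main, side)

# The side branch keeps editing its foo.c and is merged again.
side = {"id-B": ("foo.c", "renamed on B, then edited")}
second = merge_by_identity(first, side)

visible = visible_files(second)
```

After the second merge, the edit exists only under the parked identity, and `visible` still holds just the file created on the first branch.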

And as far as I can tell, BK really did the best you can do. Following 
file identities really _is_ fundamentally broken. It sounds like a nice 
idea, but while you might solve a few problems, you create a whole raft of 
much more fundamental problems.

So next time you think about a merge that might have been improved by 
tracking renames, please also think about a merge where one of the 
filenames came from two or more different sources through an earlier 
merge, and thank your benevolent Gods that they instructed me to make git 
be based purely on file contents.
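
As a rough illustration of what "based purely on file contents" can look like, here is a minimal sketch that infers renames after the fact by comparing the contents of paths that disappeared with paths that appeared. It is not git's actual algorithm: it uses Python's difflib as a stand-in similarity measure, and the function name and the 0.5 threshold are assumptions made for the example.

```python
from difflib import SequenceMatcher


def detect_renames(old_tree, new_tree, threshold=0.5):
    """Pair deleted paths with added paths whose contents are similar.

    Trees are plain dicts mapping path -> content. No file identity is
    stored anywhere; a rename is only a guess made from the contents.
    """
    deleted = {p: c for p, c in old_tree.items() if p not in new_tree}
    added = {p: c for p, c in new_tree.items() if p not in old_tree}
    renames = []
    for old_path, old_content in sorted(deleted.items()):
        best_path, best_score = None, threshold
        for new_path, new_content in sorted(added.items()):
            score = SequenceMatcher(None, old_content, new_content).ratio()
            if score >= best_score:
                best_path, best_score = new_path, score
        if best_path is not None:
            renames.append((old_path, best_path, best_score))
            # Each added path can be the target of at most one rename.
            del added[best_path]
    return renames


old = {
    "bar.c": "int main(void)\n{\n    return 0;\n}\n",
    "README": "hello\n",
}
new = {
    "foo.c": "int main(void)\n{\n    puts(\"hi\");\n    return 0;\n}\n",
    "README": "hello\n",
}
renames = detect_renames(old, new)
```

Because nothing is recorded at rename time, two files that end up under one name through a merge leave no conflicting identities behind; there is only the content in each tree to compare.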

		Linus