On Thu, 30 Nov 2006, Nicholas Allen wrote: > > Does this mean if I have, for example, a large C++ file with a bunch of > methods in it and I move one of the methods from the bottom of the file to the > top and in another branch someone makes a change to that method that when I > merge their changes git will merge their changes into the method at the top of > the file where I have moved it? Right now (and in the near future), nope. "git blame" will track the changes (so the pure movement wasn't just an addition of new code, but you'll see it track it all the way down to the original), but "git merge" is still file-based. In other words, "git merge" does uses a data similarity analysis that could be used for smaller chunks than a whole file, but at least for now it does it on a file granularity only (and then passes it off to the standard RCS three-way merge on a file-by-file basis). That said, if the movement happens _within_ a file, then just about any SCM could do what you ask for, by just using something smarter than the standard 3-way merge. So that part isn't even about tracking data across files - it's just about a per-file merge strategy. The "track data, not files" thing becomes more interesting when you factor out a file into two or more files, and can continue to merge across such a code re-filing event. Git can do it for "annotate", but doesn't do it for anything else. > If so that would be really quite impressive! Indeed, and it's one of the potential future goals that was discussed very early in the git design phase. The point of _not_ doing file ID tracking is exactly that you can actually do better than that by just tracking the data. So some day, we may do it. And not just within one file, but even between files. Because file renames really is just a very specific special case of data movement, and I don't think it's even the most common case. That said, there are several reasons why you might not actually _ever_ want it in practice, and why I say "potential future goal" and "we may do it". I think this is going to be both a matter of not just writing the code (which we haven't done), but also deciding if it's really worth it. Because merges are things where you may not want too much smarts: - Quite often, a failed merge that needs manual fixup may even be _preferable_ to a successful merge that did the merge "technically correctly", but in an unexpected way. - There's a _big_ difference between "merging code" and "examining code". It makes much more sense to try to track where code came from and what the "deep history" was when you examine code, because the reason you're doing so is generally exactly because you're looking for what went wrong, and who to blame. When going "merging", the history of the code is arguably a lot less important. What is the most important part is that the two branches you merge have been (hopefully) verified in their _current_ state. The history may be full of bugs, and they may have been fixed differently, and even trying to be really clever may not actually be a good idea at all. Code may have moved or may have been copied, but what is much more important than the original code and where it came from is the state it was in _after_ the move, because that's the tested working state, and in many ways the history of how it came to be really shouldn't matter as much at all. In other words, "annotate" and "merge" have almost entirely opposite interests. An annotation is supposed to find the history in order to maybe help find bugs, while a merge is supposed to use the _current_ state, and very arguably, if the two current states don't match _so_ obviously that there is no question about what you should do, then the merge should make that very very very clear to the user. So my personal opinion has always been that a merge should be extremely simpleminded. I think all teh VCS people who concentrate on smart merging absolutely have their heads up their arses, and do exactly the wrong thing. A merge should not do anything "clever" at all. It should be just _barely_ smart enough to do the obvious thing, and even then we all know that it will still occasionally do the wrong thing. So I actually think that a bog-standard and totally stupid three-way merge is simply not far from the right thing to do. And the git "recursive" thing basically repeats that stupid merge (a) in time (ie the criss-cross merge thing causes a recursive three-way merge to take place) and (b) in the metadata space (ie you can see the rename following basically as just a "3-way merge in filenames"). And yes, this is probably some mental deficiency and hang-up, but I think that's sufficient, and that where the real "clever" stuff should be is to then help people resolve conflicts (and maybe also help you find mis-merges even with the totally stupid and simple merge). Because conflicts _will_ happen, regardless of your merge strategy, and you do need people to look at them, but you can make it _easier_ for people to say "ok, that's obviously the right merge". So me personally, I'd rather have the "real merge" be what git already does, and then have something like a graphical "resolution helper" application that tries to resolve the remaining things with user help. And that "resolution helper" is where I'd put all the magic code movement logic, not in the merge itself. So you could look at a failed hunk, and press a "show me a best guess" button, and at that point the thing would say "that code might fit here, does that look sane to you? <Ok>, <Next guess>, <Cancel>". THAT is what a good VCS should do, in my opinion. Not do "smart merges". Btw, git doesn't do the above kind of smart graphical thing, but git _does_ do something very much in that direction. Unlike a lot of things, git doesn't just leave the "conflict marker" turds in the working tree. No, the index will contain the three-way merge base and both of the actual files you were trying to merge, and a "git diff" will actually show you a three-way diff of the working tree (and you can say "git diff --ours" to see the diff just against our old head, and "--theirs" to see a regular two-way diff against the _other_ side that you tried to merge). So git already very much embodies this concept of "don't be overly smart when merging, but try to help the user out when resolving the merge". It may not be pretty GUI etc, and it mostly helps with regular bog-standard data conflicts, but boy is it pleasant to use for those once you get used to it. So we get NONE of those horrible "you just get conflict turds, you figure it out" things. It gives you the turds (because people, including me, are used to them, and you want _something_ in the working tree that shows both versions at the same time, of course), but then you can edit them to your hearts content, and even _after_ you've edited them, you can do the above three-way (or two-way against either branch) diffs, and it will show what you edited and its relationship to the two branches you merged. THAT is what merging is all about. Not smart merges. Stupid merges with good tools to help you do the right thing when the right thing isn't _so_ obvious that you can just leave it to the machine. Linus - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html