Dear diary, on Fri, May 05, 2006 at 08:31:06PM CEST, I got a letter where Linus Torvalds <torvalds@xxxxxxxx> said that... > Moving data around happens with a whole lot more than "mv". Let's keep this on the per-file level - if you want to go below the file granularity, I already _DID_ say that I agree that explicit tracking is not a way. (If sub-file tracking would end up having any usable reliability in real-world cases, which is something I do not take for granted.) Another thing is, the sub-file content tracking would end up being a lot more "magic" than the simple per-file content tracking, and you stated several times that you prefer simple merge over better but magic merge - so why do you prefer sub-file content tracking anyway? > It happens with patches (somebody _else_ may have done an "mv", without > using git at all), _Here_ is the place for automated renames detection. Between applying and committing the patch, the user can verify that it got the renames right. That's impossible when guessing the renames later. > and it happens with editors (moving data around until > most of it exists in another file). I doubt this in fact happens that often (to a degree the automatic rename detection would catch). And if it happens, then the user has to tell Git - I have never heard that _this_ would be any problem in other version control systems. You could make it more foolproof by running the automatic rename detection on the diff being committed and suggesting the user that other yet unrecorded renames did happen. The point is, the user stays in control and can override any stupid guess. > So doing "*mv" is just a special case. > > And supporting special cases is _wrong_. If you start depending on data > that isn't actually dependable, that's WRONG. I prefer making this data dependable to having to resort to guessing on dependable less amount of data. > There's another reason why encoding movement information in the commit is > totally broken, namely the fact that a lot of the actions DO NOT WALK THE > COMMIT CHAIN! > > Try doing > > git diff v1.3.0.. > > and think about what that actually _means_. Think about the fact that it > doesn't actually walk the commit chain at all: it diffs the trees between > v1.3.0 and the current one. What if the rename happened in a commit in the > middle? Then the automated renames detection will miss it given that the other accumulated differences are large enough, and the suggested workarounds _are_ precisely walking the commit chain. If you use persistent file ids, you never miss it _AND_ you DO NOT WALK THE COMMIT CHAIN! You still just match file ids in the two trees. > The "track contents, not intentions" approach avoids both these things. > The end result is _reliable_, not a "random guess". No, the end result is whichever some heuristic randomly guessed, and it's not reliable either since the heuristic can change. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ Right now I am having amnesia and deja-vu at the same time. I think I have forgotten this before. - : send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html