On Sun, 26 Mar 2006, Jakub Narebski wrote:
>
> If (2) is common enough then discussed improvements to rename detection,
> namely comparing basenames as a base for candidate selection is a good idea.

BK had this "renametool" which got started automatically when you applied
a patch that removed one or more files and added one or more files, so
that you could then pair up the files manually.

It left the rename pairing 100% to the user, but it helped a bit by
guessing what the pairing might be, and yes, it used the basenames to set
up that initial guess. It worked in many cases, but it also failed in many
cases.

I do think it was a useful heuristic within the BK model (since the _real_
choice was left to the user), but I don't think it's very useful for git.

The thing is, the fast rename detection that is in the "next" branch
really does a lot better, and it's fast enough. (If you wanted to make it
even faster, but less precise, you could limit it to the first few
kilobytes of file contents - still a _lot_ better heuristic than the
actual filename, and it would make the worst-case behaviour much better).

> I wonder how common is (2) compared to (1)+(2) i.e. move to other dir
> and rename, old-dir/old-file.c to new-dir/new-subdir/new-file.c

I don't have any numbers, but from using renametool for a few years, my
gut feel/recollection is that about half of renames in the kernel were
moving to a new directory, and about half changed names (often in
_addition_ to moving).

But I didn't much think about it, so that's just a very rough guess based
on using a tool that helped you do these things manually.

For example, one common case was a directory structure like

    ..
    type-file1.c
    type-file2.c
    otherfiles.c
    yet-more.c
    ..

being split up into a subdirectory

    ..
    type/file1.c
    type/file2.c
    otherfiles.c
    yet-more.c
    ..

(eg drivers/scsi/aic7xx-* being given a subdirectory of its own, as
drivers/scsi/aic7xx/*).
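To make the difference between the two heuristics concrete, here is a rough Python sketch of that kind of split. It is _not_ what the "next" branch does: the function names, the 4 kB cut-off and the 0.5 threshold are all made up for illustration, and difflib is just a stand-in for a real similarity estimate.

```python
import difflib
import posixpath

PREFIX_BYTES = 4096  # assumption: "first few kilobytes"; the exact cut-off is arbitrary


def prefix_similarity(old, new, limit=PREFIX_BYTES):
    """Line-based similarity in [0, 1] of the first `limit` bytes of two blobs."""
    old_lines = old[:limit].splitlines()
    new_lines = new[:limit].splitlines()
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    return matcher.ratio()


def pair_by_basename(removed, added):
    """Basename guess: pair files only when the basename is identical."""
    by_name = {posixpath.basename(path): path for path in added}
    return {old: by_name[posixpath.basename(old)]
            for old in removed if posixpath.basename(old) in by_name}


def pair_by_content_prefix(removed, added, threshold=0.5):
    """Greedily pair each removed file with the most similar added file.

    `removed` and `added` map path -> file contents (bytes).
    """
    candidates = []
    for old_path, old_data in removed.items():
        for new_path, new_data in added.items():
            score = prefix_similarity(old_data, new_data)
            if score >= threshold:
                candidates.append((score, old_path, new_path))
    candidates.sort(reverse=True)  # best-scoring candidates first

    pairs, used = {}, set()
    for _score, old_path, new_path in candidates:
        if old_path in pairs or new_path in used:
            continue  # each file takes part in at most one rename
        pairs[old_path] = new_path
        used.add(new_path)
    return pairs
```

Fed "type-file1.c" and "type-file2.c" as removed files and "type/file1.c" and "type/file2.c" as added ones, the basename guess finds nothing to pair, while the content comparison still has something to work with as long as the files themselves stayed mostly the same.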
So the basename wouldn't stay the same, because it contained some piece of
data that became redundant with the move.

> >> 3.) splitting file into modules, huge-file.c to file1.c, file2.c?
> >> 4.) copying fragment of one file to other?
> >> 5.) moving fragment of code from one file to other?
> >
> > I'd say that (5) is very common. And (4) happens a lot under certain
> > circumstances (new driver, new architecture, new filesystem..).
> >
> > Doing (3) happens, but probably less often that it should ;/
>
> Detecting (4) and (5) fast (i.e. for merges) without auxilary (helper)
> information would probably be hard. For interrogation/porcellanish commands
> (like pickaxe) would probably be easier.

Yes. I don't think we necessarily want to merge automatically across
things like that, even if it sounds like something you'd want in a perfect
world. Stupid and obvious (and fails) is often better than smart and
complex (and succeeds), because at least you _understand_ what happens.

But _following_ a particular change back is important, and should be both
efficient and simple to do. Ie the example tool I talked about in

    http://article.gmane.org/gmane.comp.version-control.git/217

is still relevant and important, I think.

I literally think that people wouldn't even _want_ a "git annotate", if
they instead had more of a visual tool that showed the current state of
the file, and you could click on a line/set of lines to follow it back to
the previous change to that area.

I'd argue that almost always when you want "annotate", you already have
the particular place that you want to look at in mind (you're really not
interested in the whole file). So wouldn't it be _much_ nicer to have a
"graphical git-whatchanged", where you just delve deeper (and you don't
even look at the whole file like git-whatchanged does, but you ask for a
very particular region).
Ie, what I imagine would be something gitk/qgit like, where you see the
file content, select a line or two (or a whole function), and it goes back
in history and shows you the last diff that changed that
line/two/function.

We can do that EFFICIENTLY. Much more efficiently than git-annotate, in
fact. And then when you see the diff, you might say "I'm not interested in
this one, that was just a re-indent" and then continue back.

THAT is the kind of graphical tool I'd want. And dammit, it should even be
_easy_. I'm just a total klutz myself when it comes to doing things like
Qt or nice tcl/tk text-panes, and this really does have to be visual,
since the whole point is that "select text" and interactive part.

So if somebody wants to be a hero, and feels comfortable with those kinds
of things, this really should be a fairly straightforward thing to do (it
would be useful even without rename detection or data movement detection,
but it's also something where you really _could_ do efficient data
movement detection by just looking at the "whole diff" when something
changed in that small area).

        Linus
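The non-visual half of the "select a region and follow it back" idea can be sketched roughly like this in Python. Everything here is an assumption made for illustration: history is a plain oldest-first list of (commit id, lines) pairs for a single file, difflib stands in for the real diff, and merges, renames and data movement are ignored completely.

```python
import difflib


def map_range_back(old_lines, new_lines, start, end):
    """Map the half-open line range [start, end) of new_lines onto old_lines.

    Returns (changed, old_start, old_end); `changed` is True when the diff
    between the two versions touches the selected region.
    """
    changed = False
    old_start = old_end = None
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            lo, hi = max(j1, start), min(j2, end)
            if lo >= hi:
                continue  # this unchanged block does not overlap the region
            o_lo, o_hi = i1 + (lo - j1), i1 + (hi - j1)
        else:
            if tag == "delete":
                touches = start < j1 < end  # lines removed inside the region
            else:
                touches = j1 < end and j2 > start  # insert/replace overlaps it
            if not touches:
                continue
            changed = True
            o_lo, o_hi = i1, i2
        old_start = o_lo if old_start is None else min(old_start, o_lo)
        old_end = o_hi if old_end is None else max(old_end, o_hi)
    return changed, old_start, old_end


def changes_to_region(history, start, end):
    """Yield (commit_id, start, end) for each commit that touched the region,
    newest first, so a caller can stop at the first hit or keep going back."""
    for idx in range(len(history) - 1, 0, -1):
        commit_id, new_lines = history[idx]
        old_lines = history[idx - 1][1]
        changed, old_start, old_end = map_range_back(old_lines, new_lines, start, end)
        if changed:
            yield commit_id, start, end
        if old_start is None or old_start == old_end:
            return  # the region did not exist before this commit
        start, end = old_start, old_end
    yield history[0][0], start, end  # the oldest version introduced the rest
```

The first item the generator yields is "the last diff that changed that line"; asking for the next item is the "not interested in this one, continue back" step. Note that only the selected region is tracked from one version to the previous one, nothing is computed for the rest of the file.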