On Sat, Jul 12, 2008 at 12:07 AM, Johannes Schindelin
<Johannes.Schindelin@xxxxxx> wrote:
>> On Fri, Jul 11, 2008 at 11:22 PM, Johannes Schindelin
>> Yeah, I wish 'git log -C -C -M --numstat --sacrifice-chicken
>> --pretty=format:%ae --' would take care of that... That is, a git-blame
>> like mechanism that would detect such moves on a per-commit basis and
>> report them would be very useful to me.
>
> Well, the chicken (or better, a goat) should be sacrificed by you... The
> option I would call "--code-moves".

If you suggest I write up a patch to 'git log' I am afraid that would
require quite a bit more skill && knowledge of 'git log' than I have
(which is about Null :P).

> But the semantics of that need to be sorted out in a shell script first;
> maybe like I outlined (if that was not coherent, please say so).

Python is one big shell script :P, so if you meant that it should be
part of GitStats (instead of part of 'git log', which I commented on
above), python would be just fine :). The concept was clear enough
though, I think I understand what you mean.

> Well, it is not a matter of getting it right, but it is a matter of
> changes. For example, everytime we move code from one program into the
> library, and create a file for that, code changes.

<snip>

Yes, that's true, with what you described it makes sense :).

>> Very much so, but the former I figure can be easily done with 'git log
>> -C -C -M' I discovered (I need to parse it's output though, and also
>> determine what to do with moves statistics wise. Should changes made
>> due to moves just be ignored?)
>
> That is not very interesting, as we often move so small parts (think "one
> function") that -C -C -M does not trigger.

Right, why aim for the stuff when there's much more interesting fun out
there? If there was a --code-moves I agree with you that it would be a
lot more interesting to have than going with the current approach and
throwing in '-C -C -M'.
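
For reference, the parsing of 'git log --pretty=format:%ae --numstat'
output mentioned above could be sketched roughly as follows. This is
only an illustration, not GitStats' actual parser: the function name
parse_numstat and the (author, path) keying are assumptions made for
the example. It relies on numstat lines having the form
"<added><TAB><deleted><TAB><path>", with "-" in both count columns
for binary files.

```python
from collections import defaultdict


def parse_numstat(log_text):
    """Sum added/deleted lines per (author, path).

    Hypothetical helper: expects text shaped like the output of
    'git log --pretty=format:%ae --numstat', i.e. an author e-mail
    line followed by tab-separated 'added deleted path' lines.
    """
    totals = defaultdict(lambda: [0, 0])
    author = None
    for line in log_text.splitlines():
        if not line.strip():
            continue  # blank separator between commits
        fields = line.split("\t")
        if len(fields) == 3:
            added, deleted, path = fields
            if added == "-" or deleted == "-":
                continue  # binary file, no line counts available
            entry = totals[(author, path)]
            entry[0] += int(added)
            entry[1] += int(deleted)
        else:
            # Anything that is not a numstat line is taken to be
            # the %ae line that starts a new commit.
            author = line.strip()
    return dict(totals)
```

With '-C -C -M' the path column may contain rename notation such as
"old => new", which this sketch simply keeps as part of the key; what
to do with those entries is exactly the open question above.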

>> That sounds interesting, I won't need to actually do that though, I
>> already have a diff parser that gives me the lines added VS lines
>> deleted on a hunk-by-hunk basis. If it is a true move (e.g., code
>> removed in file X and added in file Y) it should be trivial to detect
>> that.
>> Something along the lines of:
>> for hunk in added:
>>     if hunk in deleted:
>>         print("Over here!!")
>
> I think that is not enough, as a code move can mean that part of a
> function was refactored into a function. The consequence is often a
> reindent, and possibly rewrapping.

Mhhh, that would indeed be beyond the scope of what I could implement
by hand, and should be left to the likes of a diff tool instead, to
avoid reinventing the wheel :).

> And it can mean that some lines have to be inserted here and there. I
> still would count that as a code move "with touch-ups".

True, true, so it turns out that the most interesting data is the most
difficult to mine, how typical.

> So I'd like to see something like
>
> <number-of-commits>: <lines-added> <lines-removed> \
> <lines-moved-from> <lines-moved-to> <filename>

Ah, I like the idea of recording moved-from and moved-to separately
instead of ignoring them; why throw away such a perfectly useful
statistic? It would be really nice if I could get this data from
'git log' (e.g., the lines-moved-from and lines-moved-to) instead of
having to calculate it myself.

> BTW I realized something else: your
> http://alturin.googlepages.com/full_activity.txt lists only
> "gitk-git/po/es.po" under git-git/po/. And it has as many added as
> deleted lines.

Correct, that's because that is what 'git log' tells me. Have a look at:

$ git log --pretty=format:%ae --numstat HEAD --

And grep for "\.po", you'll see that it lists the other po files under
"/po/de.po"

> So I suspect that "po/*" really lists both gitk's as well as git-gui's .po
> files, but merged together.

Feasible, if I use '-C -C -M' then the behavior on a directory rename
should be to take the found statistics under that directory and move
them too. That could be expensive though, what with having to search
all the keys to see whether they are affected and so on.

-- 
Cheers,

Sverre Rabbelier
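
A rough sketch of the directory-rename handling described above,
assuming the statistics live in a plain dict keyed by file path with
(added, deleted) tuples as values. The function name
move_directory_stats and that data layout are hypothetical, chosen
only for illustration. It shows where the cost comes from: every key
has to be inspected once per rename.

```python
def move_directory_stats(stats, old_dir, new_dir):
    """Return a copy of stats with entries under old_dir re-keyed
    to live under new_dir (hypothetical helper).

    stats maps a file path to an (added, deleted) tuple. If a moved
    entry collides with an existing path, the counts are summed.
    """
    old_prefix = old_dir.rstrip("/") + "/"
    new_prefix = new_dir.rstrip("/") + "/"
    moved = {}
    # Linear scan over all keys: this is the potentially expensive part.
    for path, (added, deleted) in stats.items():
        if path.startswith(old_prefix):
            path = new_prefix + path[len(old_prefix):]
        prev_added, prev_deleted = moved.get(path, (0, 0))
        moved[path] = (prev_added + added, prev_deleted + deleted)
    return moved
```

Summing on collision is a design choice for the sketch; keeping the
pre-rename and post-rename counts apart would be just as defensible.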