Hi,

On Wed, 5 Aug 2009, Sverre Rabbelier wrote:

> On Wed, Aug 5, 2009 at 08:45, Daniel Barkalow<barkalow@xxxxxxxxxxxx>
> wrote:
> > Someday, I want to have a diff output format that makes these things
> > clear somehow. I think it would be not-too-hard to get the diff code
> > to determine that an addition matches or almost matches a deletion (or
> > some unchanged code), and provide library access to this information,
> > but representing it to humans (and getting patch to still work) is
> > hard.
>
> I started on this a while ago (as part of some post-GSoC git-stats
> work), but I had a hard time finding a good rule to determine whether an
> added hunk is similar enough to a deleted hunk elsewhere. Perhaps a
> variant of Levenshtein can be used to determine how different two hunks
> are; I tried diffing the two hunks and then looking at the ratio
> between the size of the diff and the size of the original hunk, but as
> said that didn't really work out.

I think that there are two complications:

- how to present it in a format that helps the human to understand, yet
  is well-defined enough to be used as a machine-readable edit script (I
  really do not think that it can be well-designed if it only fulfills
  one of the two purposes).

  I could imagine something like this:

    diff --git a/remote-curl.c b/remote-curl.c
    partial copy from transport.c +394,130
     #ifndef NO_CURL
     static int curl_transport_push(struct transport *transport, int refspec_nr, const char **refspec, int flags)
     {
     [...]
     }
     #endif
    @@ -1,4 +1,6
    +#include "cache.h"
    +#include "transport.h"
    +#include "refs.h"
    +
     #ifndef NO_CURL
     static int curl_transport_push(struct transport *transport, int refspec_nr, const char **refspec, int flags)

- how to determine efficiently where to spend a lot of time to determine
  what could be similar enough.
  For example, the diff would still look pretty unreadable if you
  determined that there was a code move which involved a reindentation,
  so I am not at all sure whether it is worth trying hard to detect that
  it was a move after all.

Ciao,
Dscho
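
P.S. To make the ratio heuristic quoted above concrete, here is a minimal
sketch in Python. It is not what git-stats or Git's diff code does; it
only assumes that a hunk is a list of lines, and the names diff_ratio and
looks_like_move, the 0.5 threshold and the ignore_indent switch are all
made up for illustration. It diffs the deleted hunk against the added
one, counts the changed lines, and divides by the size of the deleted
hunk. The ignore_indent switch shows why a reindented move is awkward: a
comparison of the raw lines sees every line as changed.

```python
import difflib


def diff_ratio(deleted, added):
    """Changed lines in a diff of the two hunks, relative to the size of
    the deleted hunk. 0.0 means identical; larger means less similar."""
    if not deleted:
        return float("inf") if added else 0.0
    # ndiff prefixes removed lines with "- " and inserted lines with "+ ";
    # unchanged lines ("  ") and hint lines ("? ") are not counted.
    changed = sum(
        1
        for line in difflib.ndiff(deleted, added)
        if line.startswith(("+ ", "- "))
    )
    return changed / len(deleted)


def looks_like_move(deleted, added, threshold=0.5, ignore_indent=False):
    """Hypothetical rule: call it a move if the ratio is small enough."""
    if ignore_indent:
        # Strip leading whitespace so that a pure reindentation compares equal.
        deleted = [line.lstrip() for line in deleted]
        added = [line.lstrip() for line in added]
    return diff_ratio(deleted, added) <= threshold


old_hunk = ["int a;", "int b;", "return a + b;"]
new_hunk = ["    int a;", "    int b;", "    return a + b;"]

print(looks_like_move(old_hunk, new_hunk))
print(looks_like_move(old_hunk, new_hunk, ignore_indent=True))
```

Comparing every deleted hunk against every added hunk this way is
quadratic in the number of hunks, which is exactly the "where to spend a
lot of time" problem from the second point.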