[this is a bit of an old message, but I am way behind on git mail, and
nobody else seems to have responded, so...]

On Sun, May 31, 2009 at 10:28:50PM +0200, Daniel Mierswa wrote:

> I was told to try it here after visiting #git/Freenode
> I want git to think that the diff of two branches where filenames and
> whitespace amount differ are the same.
> The following is a snippet from my terminal with output, is there a
> chance to make git think that those are equal?

Rename detection in git does not respect the "-w" option at all. It
hashes each line of a text file, and then compares the hashes to see how
"similar" the files are. It already makes some effort to ignore the CR
in a CRLF sequence when calculating the hash, so just running "unix2dos"
(or vice versa) on a file should still allow it to find renames.

This could probably be extended fairly trivially to ignore arbitrary
whitespace when generating the hash (I'm not sure whether the feature
should be triggered by "-w"; it makes sense to me, but there may be
cases where people would want diff generation to have different rules
than rename detection. We might even want to ignore whitespace in diff
generation _always_, as we already do with CRLF. Somebody would need to
check the results of the two approaches against a number of cases).

If you are interested, the relevant code is in hash_chars in
diffcore-delta.c. A trivial implementation would probably look something
like the patch below. I tested it with:

  git init
  cp /usr/share/dict/words words && git add words && git commit -m one
  sed 's/^/ /' <words >munged
  git add munged && git rm words
  git diff --cached --summary

which curiously reports only 82% similarity. So maybe there is more
investigation to be done.

Anyway, patch below.
---
diff --git a/diffcore-delta.c b/diffcore-delta.c
index e670f85..63704da 100644
--- a/diffcore-delta.c
+++ b/diffcore-delta.c
@@ -145,6 +145,8 @@ static struct spanhash_top *hash_chars(struct diff_filespec *one)
 		/* Ignore CR in CRLF sequence if text */
 		if (is_text && c == '\r' && sz && *buf == '\n')
 			continue;
+		if (is_text && (c == ' ' || c == '\t'))
+			continue;
 		accum1 = (accum1 << 7) ^ (accum2 >> 25);
 		accum2 = (accum2 << 7) ^ (old_1 >> 25);
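
For anyone who wants to play with the idea outside of git, here is a
minimal standalone sketch of whitespace-insensitive line hashing. It is
not git's hash_chars(): the function name hash_line, the ignore_ws flag,
and the use of FNV-1a as the hash are all assumptions made for
illustration only. It just shows the two skip rules from the patch (CR
before LF, and space/tab) applied while hashing a single line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of git): hash one line of text.
 * A CR that is immediately followed by LF is always skipped. When
 * ignore_ws is non-zero, spaces and tabs are skipped as well, so lines
 * that differ only in that whitespace hash to the same value.
 * The hash is plain 32-bit FNV-1a, chosen only to keep the sketch short.
 */
static uint32_t hash_line(const char *buf, size_t len, int ignore_ws)
{
	uint32_t h = 2166136261u;	/* FNV-1a offset basis */
	size_t i;

	assert(buf != NULL || len == 0);

	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)buf[i];

		/* Ignore CR in a CRLF sequence */
		if (c == '\r' && i + 1 < len && buf[i + 1] == '\n')
			continue;
		/* Optionally ignore spaces and tabs */
		if (ignore_ws && (c == ' ' || c == '\t'))
			continue;

		h ^= c;
		h *= 16777619u;		/* FNV-1a prime */
	}
	return h;
}
```

With ignore_ws set, hash_line("  foo\n", 6, 1) and hash_line("foo\n", 4, 1)
return the same value, which is what lets an indented copy of a line
match its original. Note that this only decides whether two lines count
as "the same"; how a similarity percentage is then derived from the
matching lines is a separate question that the sketch does not address.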