Since commit 2f82f760 (Take binary diffs into account for "git rebase"),
binary files are included in patch ID computation. Binary files are
diffed using the text diff algorithm, however, which has a huge impact
on performance. The following tests performance for a 50000 line file
marked as binary in .gitattributes.

 $ git format-patch --stdout --full-index --ignore-if-in-upstream master

 real	0m0.367s
 user	0m0.354s
 sys	0m0.010s

Instead of hashing the diff of binary files, use the post-image sha1,
which is just as unique. As a result, performance is much improved.

 $ git format-patch --stdout --full-index --ignore-if-in-upstream master

 real	0m0.016s
 user	0m0.015s
 sys	0m0.001s

Signed-off-by: Clemens Buchacher <drizzd@xxxxxx>
---
This may be related to the rebase performance issue discussed in the
following thread.

http://mid.gmane.org/loom.20100713T082913-327@xxxxxxxxxxxxxx

I am attaching the script which I used to test performance.

 diff.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/diff.c b/diff.c
index 17873f3..20fc6db 100644
--- a/diff.c
+++ b/diff.c
@@ -3758,6 +3758,12 @@ static int diff_get_patch_id(struct diff_options *options, unsigned char *sha1)
 					len2, p->two->path);
 		git_SHA1_Update(&ctx, buffer, len1);
 
+		if (diff_filespec_is_binary(p->two)) {
+			len1 = sprintf(buffer, "%s", sha1_to_hex(p->two->sha1));
+			git_SHA1_Update(&ctx, buffer, len1);
+			continue;
+		}
+
 		xpp.flags = 0;
 		xecfg.ctxlen = 3;
 		xecfg.flags = XDL_EMIT_FUNCNAMES;
-- 
1.7.2.1.1.g202c
Attachment: test-patchid.sh (Bourne shell script)
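
For readers who want to see the idea outside of diff.c, here is a minimal standalone sketch of the short-circuit in the patch: for a binary file, the hash is fed the post-image object name in hex and the textual diff is never looked at. Everything in it is an assumption made for illustration. The names toy_ctx, toy_filepair and toy_patch_id are hypothetical, and a 64-bit FNV-1a hash stands in for git's SHA-1 routines, so the values it produces are not real patch IDs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for git's SHA-1 context: 64-bit FNV-1a, illustration only. */
struct toy_ctx {
	uint64_t h;
};

static void toy_init(struct toy_ctx *c)
{
	c->h = 14695981039346656037ULL; /* FNV-1a 64-bit offset basis */
}

static void toy_update(struct toy_ctx *c, const char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		c->h ^= (unsigned char)buf[i];
		c->h *= 1099511628211ULL; /* FNV-1a 64-bit prime */
	}
}

/* Hypothetical, simplified view of one changed file. */
struct toy_filepair {
	const char *path;
	int is_binary;
	const char *post_hex;  /* hex object name of the post-image */
	const char *diff_text; /* textual diff; only read for text files */
};

/*
 * Hash the path of every file. For binary files, hash the post-image
 * object name and skip the diff entirely; for text files, hash the diff.
 */
static uint64_t toy_patch_id(const struct toy_filepair *pairs, size_t n)
{
	struct toy_ctx ctx;
	size_t i;

	toy_init(&ctx);
	for (i = 0; i < n; i++) {
		const struct toy_filepair *p = &pairs[i];

		toy_update(&ctx, p->path, strlen(p->path));
		if (p->is_binary) {
			toy_update(&ctx, p->post_hex, strlen(p->post_hex));
			continue;
		}
		toy_update(&ctx, p->diff_text, strlen(p->diff_text));
	}
	return ctx.h;
}
```

In this sketch, two binary entries with the same path and post-image name get the same ID whatever diff_text holds, while a different post-image name gives a different ID. The cost for a binary file is one short hash update instead of a full text diff, which is where the speedup reported above comes from.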