Linus Torvalds <torvalds@xxxxxxxx> writes:

> On Wed, 3 Jan 2007, Chris Lee wrote:
>>
>> So I'm using git 1.4.1, and I have been experimenting with importing
>> the KDE sources from Subversion using git-svnimport.
>
> As one single _huge_ import? All the sub-projects together? I have to say,
> that sounds pretty horrid.

Thanks -- you said everything I should have said on this issue
while I was in bed ;-).

> Junio - I suspect "pack-check.c" really shouldn't try to do it as one
> single humungous "SHA1_Update()" call. It showed one bug on PPC, I
> wouldn't be surprised if it's implicated now on some other architecture.

If Chris still has that huge .pack & .idx pair, it would be a
very good guinea pig to try a few things on, assuming that this
problem is that the pack-check.c feeds a huge blob to SHA-1
function with a single call.

 (1) Apply the attached patch on top of "master" (the patch
     should apply to 1.4.1 almost cleanly as well, except that
     we have hashcmp(a,b) instead of memcmp(a,b,20) since then),
     and see what it says about the packfile.  If your suspicion
     is correct, it should complain about your SHA-1
     implementation.

 (2) Try tip of "next" to see if its verify-pack passes the
     check.  Again, if your suspicion is correct, it should,
     since it uses Shawn's sliding mmap() stuff that will not
     feed the whole pack in one go.

 (3) I suspect that the tip of "master" should work except
     verify-pack.  It may be interesting to see how well the tip
     of "master" and "next" performs on the resulting huge pack
     (say, "time git log -p HEAD >/dev/null").  I am hoping this
     would be another datapoint to judge the runtime penalty of
     Shawn's sliding mmap() in "next" -- I suspect the penalty
     is either negligible or even negative.

diff --git a/pack-check.c b/pack-check.c
index c0caaee..738a0c5 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -29,6 +29,28 @@ static int verify_packfile(struct packed_git *p)
 	pack_base = p->pack_base;
 	SHA1_Update(&ctx, pack_base, pack_size - 20);
 	SHA1_Final(sha1, &ctx);
+
+	if (1) {
+		SHA_CTX another;
+		unsigned char *data = p->pack_base;
+		unsigned long size = pack_size - 20;
+		const unsigned long batchsize = (1u << 20);
+		unsigned char another_sha1[20];
+
+		SHA1_Init(&another);
+		while (size) {
+			unsigned long batch = size;
+			if (batchsize < batch)
+				batch = batchsize;
+			SHA1_Update(&another, data, batch);
+			size -= batch;
+			data += batch;
+		}
+		SHA1_Final(another_sha1, &another);
+		if (hashcmp(sha1, another_sha1))
+			die("Your SHA-1 implementation cannot hash %lu bytes correctly at once", pack_size - 20);
+	}
+
 	if (hashcmp(sha1, (unsigned char *)pack_base + pack_size - 20))
 		return error("Packfile %s SHA1 mismatch with itself",
 			     p->pack_name);
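
For anybody who wants to see the shape of the check without a git
tree at hand, here is a minimal standalone sketch of the same idea:
hash a buffer once with a single update call, hash it again in
fixed-size batches, and compare the two results.  Note the
assumptions: the hash is a 64-bit FNV-1a used purely as a stand-in
(it is NOT SHA-1 and will not reproduce the suspected bug), and the
names demo_ctx, demo_init, demo_update, demo_final, hash_at_once,
hash_in_batches and batching_agrees are all made up for this
illustration; none of them exist in git or in any SHA-1 library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for SHA_CTX: 64-bit FNV-1a state.  Illustrative only, NOT SHA-1. */
struct demo_ctx {
	uint64_t state;
};

static void demo_init(struct demo_ctx *c)
{
	c->state = 0xcbf29ce484222325ULL;	/* FNV-1a 64-bit offset basis */
}

static void demo_update(struct demo_ctx *c, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	while (len--) {
		c->state ^= *p++;
		c->state *= 0x100000001b3ULL;	/* FNV-1a 64-bit prime */
	}
}

static uint64_t demo_final(struct demo_ctx *c)
{
	return c->state;
}

/* Feed the whole buffer with a single update call. */
static uint64_t hash_at_once(const unsigned char *data, size_t size)
{
	struct demo_ctx c;

	demo_init(&c);
	demo_update(&c, data, size);
	return demo_final(&c);
}

/* Feed the buffer in batches of at most batchsize bytes (same loop as the patch). */
static uint64_t hash_in_batches(const unsigned char *data, size_t size,
				size_t batchsize)
{
	struct demo_ctx c;

	assert(batchsize > 0);	/* a zero batch would never make progress */
	demo_init(&c);
	while (size) {
		size_t batch = size;
		if (batchsize < batch)
			batch = batchsize;
		demo_update(&c, data, batch);
		size -= batch;
		data += batch;
	}
	return demo_final(&c);
}

/* Returns 1 when both ways of feeding the data agree, 0 otherwise. */
static int batching_agrees(const unsigned char *data, size_t size,
			   size_t batchsize)
{
	return hash_at_once(data, size) == hash_in_batches(data, size, batchsize);
}
```

A correct incremental hash gives the same answer however the input
is sliced, so batching_agrees() should return 1 for any batch size.
The patch above relies on the same property: if the one-shot and
the batched SHA-1 disagree, the one-shot call is the suspect.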