On Tue, 3 Apr 2007, Linus Torvalds wrote:
>
> So how about this updated patch? We could certainly make "git pull" imply
> "--paranoid" if we want to, but even that is likely pretty unnecessary.
> It's not like anybody has ever shown a SHA1 collision, and if the *local*
> repository is corrupt (and has an object with the wrong SHA1 - that's what
> the testsuite checks for), then it's probably good to get the valid object
> from the remote..

Some trivial timings for indexing just the kernel pack..

Without --paranoid:

	24.61user 2.16system 0:27.04elapsed 99%CPU
	0major+14120minor pagefaults

With --paranoid:

	42.74user 3.04system 0:46.36elapsed 98%CPU
	0major+72768minor pagefaults

so it's a noticeable CPU issue, but it's even more noticeable in memory
usage (55MB vs 284MB - pagefaults give a good way to look at how much
memory really got allocated for the process).

All that extra memory is just for SHA1 commit ID information.

Now, clearly the usage scenario here is a bit odd (ie the case where we
have all the objects already), so in that sense this is very much a
worst-case situation, and you simply shouldn't *do* something like this,
but at the same time, I'm just not convinced a very theoretical SHA1
collision check is worth it.

Btw, even if we don't have any of the objects, if you have tons and tons
of objects and do a "git pull", just the *lookup* of the nonexistent
objects will be expensive: first we won't find it in any pack, then we'll
look at the loose objects, and then we'll look in the pack *again* due to
the race avoidance. So looking up nonexistent objects is actually pretty
expensive.

In fact, "--paranoid" takes one second more for me even totally outside of
a git repository, just because we waste so much time trying to look up
non-existent object files ;)

		Linus
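For anyone wanting to check the 55MB vs 284MB figures: they follow from the minor pagefault counts if you assume 4 KiB pages. The mail does not state the page size, so that is an assumption, and the helper name below is made up for illustration. A minimal sketch of the arithmetic:

```python
# Assumption: 4 KiB pages (typical on x86, but not stated in the mail).
PAGE_SIZE = 4096


def faults_to_mib(minor_faults, page_size=PAGE_SIZE):
    """Approximate memory touched, in MiB, from a minor pagefault count."""
    return minor_faults * page_size / (1024 * 1024)


# Minor pagefault counts taken from the two timing runs in the mail.
without_paranoid = faults_to_mib(14120)
with_paranoid = faults_to_mib(72768)

print(f"without --paranoid: {without_paranoid:.0f} MiB")
print(f"with --paranoid:    {with_paranoid:.0f} MiB")
```

Each minor fault maps one fresh page into the process, so the count times the page size gives a rough upper bound on the memory actually touched, which is why it is a better guide here than the virtual size.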