On Tue, 3 Apr 2007, Nicolas Pitre wrote:
> >
> > Yeah. What happens is that inside the repo, because we do all the
> > duplicate object checks (verifying that there are no evil hash collisions)
> > even after fixing the memory leak, we end up keeping *track* of all those
> > objects.
>
> What do you mean?

Look at what we have to do to look up a SHA1 object.. We create all the
lookup infrastructure, we don't *just* read the object. The delta base
cache is the most obvious one.

> I'm of the opinion that this patch is unnecessary. It only helps in
> bogus workflows to start with, and it makes the default behavior unsafe
> (unsafe from a paranoid pov, but still). And in the _normal_ workflow
> it should never trigger.

Actually, even in the normal workflow it will do all the extra unnecessary
work, if only because of the lookup costs of *not* finding the entry.

Lookie here:

 - git index-pack of the *git* pack-file in the v2.6/linux directory (zero
   overlap of objects)

   With --paranoid:

	2.75user 0.37system 0:03.13elapsed 99%CPU
	0major+5583minor pagefaults

   Without --paranoid:

	2.55user 0.12system 0:02.68elapsed 99%CPU
	0major+2957minor pagefaults

See? That's the *normal* workflow. Zero objects found. 7% CPU overhead
from just the unnecessary work, and almost twice as much memory used. Just
from the index file lookup etc for a decent-sized project.

Now, in the KDE situation, the *unnecessary* lookups will be about ten
times more expensive, both on memory and CPU, just because the repository
is about 20x the size. Even with no actual hits.

		Linus
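
As a rough cross-check of the figures quoted above, the relative overheads can be recomputed from the two timing summaries. This is only a sketch: the dictionary layout and the helper name relative_increase are made up for illustration and are not part of git, and treating the minor page-fault count as a stand-in for memory use is an assumption.

```python
# Numbers copied from the two timing summaries in the message.
# The dictionary layout is hypothetical, chosen only for this sketch.
paranoid = {"user": 2.75, "system": 0.37, "elapsed": 3.13, "minor_faults": 5583}
default = {"user": 2.55, "system": 0.12, "elapsed": 2.68, "minor_faults": 2957}


def relative_increase(with_check, without_check):
    """Fractional increase of with_check over without_check."""
    return (with_check - without_check) / without_check


# User CPU time: the cost of the extra lookups themselves.
user_overhead = relative_increase(paranoid["user"], default["user"])

# Wall-clock time: also includes the extra system time.
elapsed_overhead = relative_increase(paranoid["elapsed"], default["elapsed"])

# Minor page faults, used here as a rough proxy for memory touched (assumption).
fault_ratio = paranoid["minor_faults"] / default["minor_faults"]

print(f"user CPU overhead:      {user_overhead:.1%}")
print(f"elapsed-time overhead:  {elapsed_overhead:.1%}")
print(f"minor page-fault ratio: {fault_ratio:.2f}x")
```

The user-time increase comes out at a little under 8%, in the same range as the 7% quoted, and the page-fault ratio at roughly 1.9x, consistent with "almost twice as much memory". The wall-clock increase is larger because the system time also goes up.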