On Thu, 14 Aug 2008, Nicolas Pitre wrote:
>
> > so most of it is in inflate,
>
> Which, again, would be eliminated entirely by pack v4.

I seriously doubt that.

Nico, it's really easy to say "I wave my magic wand and nothing remains".
It's hard to actually _do_.

> One optimization with pack v4 was to have delta chunks aligned on tree
> records, and because tree objects are no longer compressed, parsing a
> tree object could be done by simply walking the delta chain directly.

Even if you do that, please take a look at the performance characteristics
of modern CPU's.

Here's a hint: the cost of a cache miss is generally about a hundred times
the cost of just about anything else.

So to make a convincing argument, you'd have to show that the actual
memory access patterns are also much better.

No, zlib isn't perfect, and nope, inflate_fast() is no "memcpy()". And
yes, I'm sure a pure memcpy would be much faster. But I seriously suspect
that a lot of the cost is literally in bringing in the source data to the
CPU. Because we just mmap() the whole pack-file, the first access to the
data is going to see the cost of the cache misses.

		Linus
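
A rough way to check the claim about first access to mmap()ed data is to
map a file, touch one byte per cache line, and time two consecutive passes.
The sketch below is not taken from git. The helper names (now_ns,
touch_pass, time_mmap_passes) are made up for illustration, and the 64-byte
stride a caller would pass is an assumption about the cache-line size. The
first pass pays for page faults and cold cache lines, while the second pass
usually runs against warmer state. Whether the difference shows up depends
on file size, the page cache, and the hardware, so treat any numbers as
indicative only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Read one byte every `stride` bytes and return the sum of those bytes. */
static uint64_t touch_pass(const unsigned char *p, size_t len, size_t stride)
{
	uint64_t sum = 0;

	assert(stride > 0);
	for (size_t i = 0; i < len; i += stride)
		sum += p[i];
	return sum;
}

/*
 * Map `path` read-only, walk the mapping twice, and report how long each
 * pass took plus the byte sum (so the reads cannot be optimized away).
 * Returns 0 on success, -1 on any error or if the file is empty.
 */
static int time_mmap_passes(const char *path, size_t stride,
			    uint64_t *first_ns, uint64_t *second_ns,
			    uint64_t *checksum)
{
	struct stat st;
	const unsigned char *p;
	uint64_t t0, t1, t2, a, b;
	size_t len;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0 || st.st_size == 0) {
		close(fd);
		return -1;
	}
	len = (size_t)st.st_size;
	p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);	/* the mapping stays valid after close() */
	if (p == MAP_FAILED)
		return -1;

	t0 = now_ns();
	a = touch_pass(p, len, stride);	/* first touch: faults + cold lines */
	t1 = now_ns();
	b = touch_pass(p, len, stride);	/* second touch: usually warmer */
	t2 = now_ns();

	munmap((void *)p, len);
	if (a != b)
		return -1;

	*first_ns = t1 - t0;
	*second_ns = t2 - t1;
	*checksum = a;
	return 0;
}
```

Calling time_mmap_passes() on a pack file with a stride of 64 and comparing
*first_ns against *second_ns gives a crude picture of how much of the time
goes into bringing the data to the CPU, as opposed to anything done with it
afterwards. It says nothing about inflate itself; separating the two would
need a profiler.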