> Here's a hint: the cost of a cache miss is generally about a hundred times

100 times seems quite optimistic %)

> No, zlib isn't perfect, and nope, inflate_fast() is no "memcpy()". And
> yes, I'm sure a pure memcpy would be much faster. But I seriously suspect
> that a lot of the cost is literally in bringing in the source data to the
> CPU. Because we just mmap() the whole pack-file, the first access to the
> data is going to see the cost of the cache misses.

I would have thought that zlib has a sequential access pattern that the
CPU prefetchers have an easy time hiding latency for.

BTW I always wonder why people reason about cache misses in oprofile
logs without actually using the cache miss counters.

-Andi
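
P.S.: Here is a minimal sketch of what "actually using the counters" could look
like. It uses the Linux perf_event_open(2) syscall rather than an oprofile
setup, counts PERF_COUNT_HW_CACHE_MISSES around a sequential pass over a big
anonymous mmap() buffer that stands in for the pack-file; the buffer size,
event choice and anonymous mapping are placeholders for illustration, not
anything taken from git itself.

/* measure_misses.c: count hardware cache misses over a sequential pass
 * through a large buffer, roughly the access pattern inflate() sees on
 * an mmap'ed pack-file.  Build with: gcc -O2 measure_misses.c
 */
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BUF_SIZE (256u << 20)	/* 256 MB, larger than any last-level cache */

static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
			    int cpu, int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int main(void)
{
	struct perf_event_attr attr;
	unsigned char *buf;
	uint64_t misses, sum = 0;
	size_t i;
	int fd;

	/* Set up a per-process counter for hardware cache misses. */
	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_HARDWARE;
	attr.size = sizeof(attr);
	attr.config = PERF_COUNT_HW_CACHE_MISSES;
	attr.disabled = 1;
	attr.exclude_kernel = 1;

	fd = perf_event_open(&attr, 0, -1, -1, 0);
	if (fd < 0) {
		perror("perf_event_open");
		return 1;
	}

	/* Anonymous mapping stands in for the mmap'ed pack-file. */
	buf = mmap(NULL, BUF_SIZE, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	memset(buf, 1, BUF_SIZE);	/* fault the pages in up front */

	ioctl(fd, PERF_EVENT_IOC_RESET, 0);
	ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

	/* Sequential read pass: the pattern the prefetchers should like. */
	for (i = 0; i < BUF_SIZE; i++)
		sum += buf[i];

	ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
	if (read(fd, &misses, sizeof(misses)) != sizeof(misses)) {
		perror("read");
		return 1;
	}

	printf("sum=%llu, cache misses on sequential pass: %llu\n",
	       (unsigned long long)sum, (unsigned long long)misses);
	close(fd);
	return 0;
}

Wrapping the real inflate() call between the enable/disable ioctls in the same
way would give actual miss numbers for the pack-file case, instead of guessing
from where the profile samples land.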