On Fri, 16 Mar 2007, Jeff Garzik wrote:
>
> Although it sounds like zlib could indeed be optimized to reduce its startup
> and shutdown overhead, I wonder if switching compression algorithms to a pure
> Huffman or even RLE compression (with associated lower startup/shutdown costs)
> would perform better in the face of all those small objects.

Well, the thing is, I personally much prefer to have just a single
compression algorithm and object layout.

Most of the performance-critical objects from a decompression standpoint
during commit traversal are small (especially if you do pathname limiting),
but when you do something like a "git add .", most objects are actually
random blob objects, and you need a compression algorithm that works in the
general case too.

Of course, pack-v4 may (and likely will) end up using different strategies
for different objects (deltas in particular), but the "one single object
compression type" was a big deal for the initial implementation. It may not
be fundamental to git operation (so we can fairly easily change it and make
it more complex without any higher-level code even noticing), but it was
definitely fundamental to getting something stable and working up and
running quickly.

> And another random thought, though it may be useless in this thread: I bet
> using a pre-built (compiled into git) static zlib dictionary for git commit
> and tree objects might improve things a bit.

That's kind of pack-v4 area. It will happen, but I'd actually like to see
if we can just avoid the stupid performance problems with zlib,
independently of trying to make more tuned formats.

		Linus
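[Editor's note: the static-dictionary idea above is easy to prototype with zlib's preset-dictionary support. A minimal sketch in Python's stdlib `zlib` follows; the dictionary contents and the sample commit object are made up for illustration, and git itself would do the equivalent in C via zlib's deflateSetDictionary()/inflateSetDictionary().]

```python
import zlib

# Hypothetical preset dictionary holding boilerplate common to commit
# objects; a real one would be tuned against a corpus of commits/trees.
ZDICT = (b"tree \nparent \n"
         b"author A U Thor <author@example.com> 1173000000 +0000\n"
         b"committer A U Thor <author@example.com> 1173000000 +0000\n\n")

# A made-up commit object body -- the kind of small object discussed above.
COMMIT = (b"tree 9daeafb9864cf43055ae93beb0afd6c7d144bfa4\n"
          b"parent 3b18e512dba79e4c8300dd08aeb37f8e728b8dad\n"
          b"author A U Thor <author@example.com> 1173000000 +0000\n"
          b"committer A U Thor <author@example.com> 1173000000 +0000\n"
          b"\n"
          b"fix zlib overhead\n")

def deflate(data: bytes, zdict: bytes = b"") -> bytes:
    """Deflate one object, optionally priming zlib with a preset dictionary."""
    c = zlib.compressobj(level=9, zdict=zdict) if zdict else zlib.compressobj(level=9)
    return c.compress(data) + c.flush()

def inflate(data: bytes, zdict: bytes = b"") -> bytes:
    """Inflate; the same dictionary must be supplied on decompression."""
    d = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return d.decompress(data) + d.flush()

plain = deflate(COMMIT)           # ordinary per-object deflate
primed = deflate(COMMIT, ZDICT)   # deflate with the preset dictionary
```

The dictionary-primed stream should come out smaller because the long author/committer boilerplate becomes a back-reference into the dictionary instead of literal bytes, which matters most for exactly the small commit and tree objects under discussion.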