On Fri, 16 Mar 2007, Nicolas Pitre wrote: > On Fri, 16 Mar 2007, Linus Torvalds wrote: > > > The most performance-critical objects for uncompression are commits and > > trees. At least for the kernel, the average size of a tree object is 678 > > bytes. And that's ignoring the fact that most of them are then deltified, > > so about 80% of them are likely just a ~60-byte delta. > > This is why in pack v4 there will be an alternate tree object > representation which is not deflated at all. Well, the thing is, for things that really don't compress, zlib shouldn't add much of an overhead on uncompression. It *should* just end up being a single "memcpy()" after you've done: - check the header for size and mode ("plain data") - check the adler checksum (which is *really* nice - we've found real corruption this way!). The adler32 checksumming may sound unnecessary when you already have the SHA1 checksum, but the thing is, we normally don't actually *check* the SHA1 except when doing a full fsck. So I actually like the fact that object unpacking always checks at least the adler32 checksum at each stage, which you get "for free" when you use zlib. So not using compression at all actually not only gets rid of the compression, it gets rid of a good safety valve - something that may not be immediately obvious when you don't think about what all zlib entails. People think of zlib as just compressing, but I think the checksumming is almost as important, which is why it isn't an obviously good thing to not compress small objects just because you don't win on size! Remember: stability and safety of the data is *the* #1 objective here. The git SHA1 checksums guarantees that we can find any corruption, but in every-day git usage, the adler32 checksum is the one that generally would *notice* the corruption and cause us to say "uhhuh, need to fsck". Everything else is totally secondary to the goal of "your data is secure". Yes, performance is a primary goal too, but it's always "performance with correctness guarantees"! But I just traced through a simple 60-byte incompressible zlib thing. It's painful. This should be *the* simplest case, and it should really just be the memcpy and the adler32 check. But: [torvalds@woody ~]$ grep '<inflate' trace | wc -l 460 [torvalds@woody ~]$ grep '<adler32' trace | wc -l 403 [torvalds@woody ~]$ grep '<memcpy' trace | wc -l 59 ie we spend *more* instructions on just the stupid setup in "inflate()" than we spend on the adler32 (or, obviously, on the actual 60-byte memcpy of the actual incompressible data) I dunno. I don't mind the adler32 that much. The rest seems to be pretty annoying, though. Linus - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html