Re: cleaner/better zlib sources?

Linus Torvalds wrote:
The normal size for the performance-critical git objects are in the couple of *hundred* bytes. Not kilobytes, and not megabytes.

The most performance-critical objects for uncompression are commits and trees. At least for the kernel, the average size of a tree object is 678 bytes. And that's ignoring the fact that most of them are then deltified, so about 80% of them are likely just a ~60-byte delta.


Ahhh. At least for me, that explains a lot. Rather than spending all its time in inflate_fast(), git is dealing with lots of zlib startup/shutdown overhead.
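
One rough way to see the per-object setup cost is to inflate many tiny streams individually and compare that with inflating the same bytes as one stream. The sketch below is only an illustration: the payloads are synthetic stand-ins for ~60-byte deltas, the helper names are made up, and Python's own call overhead is mixed into the timings, so the numbers are indicative at best.

```python
import time
import zlib

def make_small_objects(count=2000, size=60):
    """Build `count` distinct synthetic payloads of exactly `size` bytes each."""
    objs = []
    for i in range(count):
        seed = ("delta %06d " % i).encode("ascii")
        # Repeat the seed enough times to cover `size` bytes, then truncate.
        objs.append((seed * (size // len(seed) + 1))[:size])
    return objs

def inflate_each(blobs):
    """Inflate every blob separately: one zlib stream setup/teardown per object."""
    return [zlib.decompress(b) for b in blobs]

objects = make_small_objects()
per_object = [zlib.compress(o) for o in objects]   # many tiny streams
single = zlib.compress(b"".join(objects))          # one large stream

t0 = time.perf_counter()
out_each = inflate_each(per_object)
t1 = time.perf_counter()
out_single = zlib.decompress(single)
t2 = time.perf_counter()

print("per-object inflate: %.6f s" % (t1 - t0))
print("single-stream inflate: %.6f s" % (t2 - t1))
```

If the first figure is much larger than the second for the same number of output bytes, most of the time is going into stream setup and teardown rather than the inner decode loop.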

Although it sounds like zlib could indeed be optimized to reduce its startup and shutdown overhead, I wonder if switching to a pure Huffman or even RLE compression scheme (with correspondingly lower startup/shutdown costs) would perform better in the face of all those small objects.
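
A quick way to gauge how much compression a Huffman-only scheme would give up on a small object is zlib's own `Z_HUFFMAN_ONLY` strategy. This is a stand-in, not the separate lightweight codec suggested above: the output is still an ordinary deflate stream read by the normal inflate path, so it says something about size and nothing about startup cost. The sample commit text and the helper name are made up for illustration.

```python
import zlib

def deflate_with_strategy(data, strategy):
    """Compress `data` with the given zlib strategy, default level and window."""
    c = zlib.compressobj(strategy=strategy)
    return c.compress(data) + c.flush()

# Hypothetical commit-like payload, a few hundred bytes at most.
sample = (
    b"tree 0123456789abcdef0123456789abcdef01234567\n"
    b"parent 89abcdef0123456789abcdef0123456789abcdef\n"
    b"author A U Thor <author@example.com> 1173000000 -0700\n"
    b"committer C O Mitter <committer@example.com> 1173000000 -0700\n"
    b"\n"
    b"Fix a typo\n"
)

strategies = {
    "default": zlib.Z_DEFAULT_STRATEGY,
    "huffman-only": zlib.Z_HUFFMAN_ONLY,
}
# Z_RLE is only exposed when the underlying zlib and Python build provide it.
if hasattr(zlib, "Z_RLE"):
    strategies["rle"] = zlib.Z_RLE

results = {name: deflate_with_strategy(sample, s) for name, s in strategies.items()}

print("raw: %d bytes" % len(sample))
for name, blob in results.items():
    print("%s: %d bytes" % (name, len(blob)))
```

All of these streams decode with a plain `zlib.decompress()`, which is exactly why this comparison cannot show any reduction in inflate setup cost.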

And another random thought, though it may be useless in this thread: I bet using a pre-built (compiled into git) static zlib dictionary for git commit and tree objects might improve things a bit.
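
The preset-dictionary idea can be tried out with the `zdict` parameter of Python's `zlib` module, which wraps zlib's `deflateSetDictionary()` and `inflateSetDictionary()`. The dictionary contents, the sample commit and the helper names below are assumptions made up for illustration. A real dictionary would be built from actual commit and tree objects, with the most common strings placed near the end, as the zlib documentation recommends.

```python
import zlib

# Hypothetical preset dictionary: strings likely to appear in commit objects.
GIT_ZDICT = b" +0000\n -0700\n\nparent committer author tree "

def deflate(data, zdict=None):
    """Compress `data`, optionally priming the window with a preset dictionary."""
    c = zlib.compressobj(zdict=zdict) if zdict else zlib.compressobj()
    return c.compress(data) + c.flush()

def inflate(blob, zdict=None):
    """Decompress `blob`; the same dictionary must be supplied if one was used."""
    d = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return d.decompress(blob) + d.flush()

# Hypothetical commit object body.
commit = (
    b"tree 0123456789abcdef0123456789abcdef01234567\n"
    b"parent 89abcdef0123456789abcdef0123456789abcdef\n"
    b"author A U Thor <author@example.com> 1173000000 -0700\n"
    b"committer C O Mitter <committer@example.com> 1173000000 -0700\n"
    b"\n"
    b"Fix a typo\n"
)

plain = deflate(commit)
primed = deflate(commit, GIT_ZDICT)

print("raw: %d bytes" % len(commit))
print("no dictionary: %d bytes" % len(plain))
print("with dictionary: %d bytes" % len(primed))
```

Whether the dictionary helps depends entirely on how well it matches real objects. A stream written with a dictionary also carries a 4-byte dictionary ID in its header, so a poorly matched dictionary can make tiny objects slightly larger.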

	Jeff

