Nicolas Pitre <nico@xxxxxxxxxxx> writes:

> So the idea is to do that once to construct the pack index and allow
> for random access once the index is available. Accessing a particular
> object without the pack index would be extremely costly otherwise,
> especially if it is towards the end of the pack.

Thanks for the explanation. It's clear now.

> The reason for storing only the expanded data size is to have the
> exact buffer size allocated for the inflated data. The zlib stream
> that follows is encoded to consume only the needed data to produce the
> inflated object. When the output buffer is all used, the zlib library
> should flag the end of the deflated stream. If not then there is an
> error in the pack data.

That provides some error checking, then, as we trust zlib to know when
it's had enough input, and we have to trust its assessment on how much
is enough, given the lack of delimiting or framing in the packfile
format.

By the way, I looked over the zlib manual¹, and I see that many of the
inflating/decompressing functions require the caller to specify the
number of input bytes available. There is inflateBack() that uses
callback functions to request more data upon underflow. The
higher-level inflate() function also looks like it can be called in a
loop, refilling the input buffer upon underflow. Is Git using one of
these two functions here?

[...]

> When in doubt, the code is always the ultimate source of information.

Yes, I need to learn my way around in there to find the call sites
relevant to this discussion.

Footnotes:
¹ http://www.zlib.net/manual.html

--
Steven E. Harris
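
To make the "call it in a loop, refilling the input on underflow" idea
concrete, here is a minimal sketch. It is not Git's code, and it uses
Python's zlib module as a stand-in for the C inflate() API discussed
above. The names inflate_exact and read_chunk are hypothetical. It
assumes only what the thread states: the expanded size is known in
advance, and nothing in the input marks where the deflated stream ends.

```python
import io
import zlib


def inflate_exact(read_chunk, expected_size):
    """Inflate one zlib stream whose expanded size is known in advance.

    read_chunk is a hypothetical callback that returns the next piece of
    input, or b"" when the source is exhausted. Returns (data, leftover),
    where leftover is input that was read past the end of the stream.
    """
    d = zlib.decompressobj()
    out = bytearray()
    pending = b""
    while not d.eof:
        # Allow one byte more than expected so an overrun is detectable
        # without letting a corrupt stream expand without bound.
        room = expected_size + 1 - len(out)
        if room <= 0:
            raise ValueError("stream inflates to more than the stored size")
        if not pending:
            # Input underflow: ask the caller for more bytes.
            pending = read_chunk()
            if not pending:
                raise ValueError("input ended before the zlib stream did")
        out += d.decompress(pending, room)
        # Input not consumed because the output limit was hit.
        pending = d.unconsumed_tail
    if len(out) != expected_size:
        raise ValueError("stream inflates to less than the stored size")
    # zlib itself found the end of the stream; anything after it in the
    # last chunk belongs to whatever follows in the file.
    return bytes(out), d.unused_data


# Two deflated objects stored back to back with no framing between them.
first = b"blob one " * 50
second = b"blob two"
source = io.BytesIO(zlib.compress(first) + zlib.compress(second))

# Feed the decompressor 7 bytes at a time to force repeated refills.
data, leftover = inflate_exact(lambda: source.read(7), len(first))
assert data == first
assert zlib.decompress(leftover + source.read()) == second
```

The caller never says how many input bytes belong to the first object.
The decompressor reports the end of the stream itself, and the stored
size serves only as a consistency check on the result.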