Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> The reason?
>
> Right now we depend on "avail_out" also making zlib understand to stop
> looking at the input stream. Sad, but true - we don't know or care about
> the compressed size of the object, only the uncompressed size. So in
> unpack_compressed_entry(), we simply set the output length, and expect
> zlib to stop when it's sufficient.
>
> Which it does - but the patch kind of violates that whole design.
>
> Now, it so happens that things seem to work, probably because the zlib
> format does have enough synchronization in it to not try to continue past
> the end _anyway_, but I think this makes the patch be of debatable value.

I thought the fact that we check the status against Z_STREAM_END means
we already expect and rely on zlib to know where the end of the input
stream is and to stop there (otherwise we say something fishy is going
on and we error out), and that this was part of the design, not just
"so happens" and "has enough synch ... _anyway_".

If the input zlib stream were corrupted and zlib detected the end of
stream too early, the "stream.total_out != size" check would catch it
even though we would see "st == Z_STREAM_END".

If the input stream were corrupted and inflation ran past the end
marker, we would read past the end and into garbage, namely the in-pack
header of the next object representation; but zlib should not go
berserk even in that case, and would stop after filling the slop you
allocated in the buffer. We would detect that situation from
"stream.total_out != size", and most likely "st != Z_STREAM_END" as
well.

While I think 5% is large enough, I'll leave this on the backburner for
now. I think the graver issue is that we inflate the same object many
times, as you noticed during the discussion.
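
To make the two failure modes above concrete, here is a minimal sketch
of the detection pattern, not git's actual unpack_compressed_entry();
the inflate_entry() name and its parameters are made up for
illustration. We allocate one byte of slop past the known uncompressed
size, hand zlib the rest of the mapped pack data as input, and let
Z_STREAM_END plus a total_out comparison decide whether the stream
ended exactly where we expected.

/*
 * Minimal sketch, not git's code: inflate_entry() and its parameters
 * are hypothetical.  Compile with -lz.
 */
#include <stdlib.h>
#include <string.h>
#include <zlib.h>

void *inflate_entry(const unsigned char *data, unsigned long avail,
		    unsigned long size)
{
	z_stream stream;
	unsigned char *buffer;
	int st;

	/* one byte of slop past the expected uncompressed size */
	buffer = malloc(size + 1);
	if (!buffer)
		return NULL;

	memset(&stream, 0, sizeof(stream));
	stream.next_in = (unsigned char *)data;
	stream.avail_in = (uInt)avail;		/* may extend past this entry */
	stream.next_out = buffer;
	stream.avail_out = (uInt)size + 1;	/* includes the slop byte */

	if (inflateInit(&stream) != Z_OK) {
		free(buffer);
		return NULL;
	}
	/* all input and output space is available, so one call suffices */
	st = inflate(&stream, Z_FINISH);
	inflateEnd(&stream);

	/*
	 * A stream that ends too early reaches Z_STREAM_END with
	 * total_out < size; one that runs long writes into the slop
	 * byte, so total_out > size (and usually st != Z_STREAM_END).
	 * Either form of corruption is caught by this single test.
	 */
	if (st != Z_STREAM_END || stream.total_out != size) {
		free(buffer);
		return NULL;
	}
	return buffer;
}

A truncated stream trips only the total_out comparison; an overlong
one spills into the slop byte and usually trips both checks, which is
why the slop has to exist at all: with avail_out set to exactly size,
an overlong stream would stop at total_out == size and only the status
check would notice.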