Re: Infinite loop regression in git-fsck in v2.12.0

Jeff King <peff@xxxxxxxx> · Tue, 30 Oct 2018 19:12:32 -0400

On Tue, Oct 30, 2018 at 06:56:03PM -0400, Jeff King wrote:

> > >  	while (total_read <= size &&
> > > +	       stream->avail_in > 0 &&
> > >  	       (status == Z_OK || status == Z_BUF_ERROR)) {
> > >  		stream->next_out = buf;
> > >  		stream->avail_out = sizeof(buf);
> > 
> > Hmph.  If the last round consumed the final input byte and needed
> > output space of N bytes, but only M (< N) bytes of the output space
> > was available, then it would have reduced both avail_in and
> > avail_out down to zero and yielded Z_BUF_ERROR, no?  Or would zlib
> > refrain from consuming that final byte (leaving avail_in to at least
> > one) and give us Z_BUF_ERROR in such a case?
> 
> Hmm, yeah, good thinking. I think zlib could consume that final byte
> into its internal buffer.
> 
> As part of my digging, I looked at how the loose streaming code handles
> this. It checks that when we see Z_BUF_ERROR, we actually did run out of
> output bytes (so if we didn't, then we know it's not the case we
> expected to be looping on).
> 
> I have some patches almost ready to send; I'll use that technique.

And here they are.

  [1/3]: t1450: check large blob in trailing-garbage test
  [2/3]: check_stream_sha1(): handle input underflow
  [3/3]: cat-file: handle streaming failures consistently

 builtin/cat-file.c | 16 ++++++++++++----
 sha1-file.c        |  3 ++-
 t/t1450-fsck.sh    | 23 +++++++++++++++++++++--
 3 files changed, 35 insertions(+), 7 deletions(-)
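
For anyone skimming the thread, the Z_BUF_ERROR check described in the
quoted text boils down to roughly the following. This is only a minimal
sketch under my own assumptions, not the code from the patches: the
helper name should_continue_inflating() and the SKETCH_* constants are
made up for illustration (the numeric values are the ones zlib.h uses
for Z_OK, Z_STREAM_END and Z_BUF_ERROR, redefined so the snippet
compiles without zlib installed).

```c
#include <stddef.h>

/*
 * Same numeric values as Z_OK, Z_STREAM_END and Z_BUF_ERROR in zlib.h.
 * Redefined under hypothetical names so this sketch has no dependency
 * on zlib itself.
 */
enum {
	SKETCH_Z_OK = 0,
	SKETCH_Z_STREAM_END = 1,
	SKETCH_Z_BUF_ERROR = -5
};

/*
 * Hypothetical helper: given the status returned by the last inflate()
 * call and the number of unused bytes left in the output buffer
 * (stream->avail_out), decide whether the loop should go around again.
 * Returns 1 to keep looping, 0 to stop.
 */
int should_continue_inflating(int status, size_t avail_out)
{
	if (status == SKETCH_Z_OK)
		return 1;	/* normal progress, keep going */

	if (status == SKETCH_Z_BUF_ERROR && avail_out == 0)
		return 1;	/* output buffer is full; drain it and retry */

	/*
	 * Z_STREAM_END, a hard error, or Z_BUF_ERROR while output space
	 * was still free. In the last case the output buffer was not the
	 * problem, so it is not the condition the loop expects to retry
	 * on, and going around again would spin forever.
	 */
	return 0;
}
```

A caller would use this as the loop condition in place of the bare
"status == Z_OK || status == Z_BUF_ERROR" test, and then treat any
final status other than Z_STREAM_END as a corrupt object. Keying on
avail_out rather than avail_in avoids the problem raised in the quoted
text, where zlib may already have consumed the final input byte.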

-Peff