On Wed, Jul 4, 2012 at 5:40 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > Jeff King <peff@xxxxxxxx> writes: > >> The problem seems to be in index-pack.c:unpack_data, which does this: >> >>> git_inflate_init(&stream); >>> stream.next_out = data; >>> stream.avail_out = consume ? 64*1024 : obj->size; >>> >>> do { >>> unsigned char *last_out = stream.next_out; >>> ssize_t n = (len < 64*1024) ? len : 64*1024; >>> n = pread(pack_fd, inbuf, n, from); >>> if (n < 0) >>> die_errno(_("cannot pread pack file")); >>> if (!n) >>> die(Q_("premature end of pack file, %lu byte missing", >>> "premature end of pack file, %lu bytes missing", >>> len), >>> len); >>> from += n; >>> len -= n; >>> stream.next_in = inbuf; >>> stream.avail_in = n; >>> status = git_inflate(&stream, 0); >>> if (consume) { >>> if (consume(last_out, stream.next_out - last_out, cb_data)) { >>> free(inbuf); >>> free(data); >>> return NULL; >>> } >>> stream.next_out = data; >>> stream.avail_out = 64*1024; >>> } >>> } while (len && status == Z_OK && !stream.avail_in); >>> >>> /* This has been inflated OK when first encountered, so... */ >>> if (status != Z_STREAM_END || stream.total_out != obj->size) >>> die(_("serious inflate inconsistency")); > > Yeah, that "if (consume)" part is clearly bogus. It should feed > what it read in "inbuf" fully to git_inflate() in a (missing) inner > loop and keep feeding consume() with the inflated data, but instead > happily goes on to read more from the packfile once the initial part > of the inflated data is fed to consume, ignoring the remainder. Agreed. >> So I don't really understand what this !stream.avail_in is doing there >> in the do-while loop. Don't we instead need to have an inner loop that >> keeps feeding the result of pread into git_inflate until we don't have >> any available data left? > > Exactly. I do not think the avain_in check should be done at the > end of the outer loop at all. When we are buffering the entire > inflated data in core without using consume, we allocate enough > memory to "data" to hold the whole thing, so in that case, it may be > OK to expect that git_inflate() would never return without consuming > the input bytes (i.e. stream.avail_in would always be zero at the > site of the check), but with consume(), we give small piece of > memory as a temporary output area and call git_inflate(), and it is > entirely possible and normal for it to run out of output before > inflating all the input bytes. Maybe the !stream.avail_in check is for safety? obj->size could be wrong due to an index-pack bug, leading to insufficient avail_out.. By the way I searched the commit that introduces that check with "git log --follow -p builtin/index-pack.c" but I could not find it. What did I do wrong? -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html