On Tue, Jul 03, 2012 at 11:31:43PM -0700, Junio C Hamano wrote: > Nguyen Thai Ngoc Duy <pclouds@xxxxxxxxx> writes: > > > By the way I searched the commit that introduces that check with "git > > log --follow -p builtin/index-pack.c" but I could not find it. What > > did I do wrong? > > Your commit 8a2e163cc shows changes to the file at ll.535-540; these > come from 776ea370 builtin-index-pack.c ll.383-388. > > $ git show 776ea370:builtin-index-pack.c > > The get_data_from_pack() function in that commit gives sufficient > buffer to output side (avail_out starts out as obj->size), and feeds > the data from the packfile in chunks. With the arrangement this > commit makes to call git_inflate(), it should never get stuck > because it ran out of output buffer. In each iteration of the loop, > when the function returns, status should read Z_OK and the function > should have consumed all input. > > But the version that uses consume() function does not give > sufficient output buffer to ensure that the input will always be > inflated fully (avoiding to use large output buffer is the whole > point of your patch after all), so with your patch, that no longer > holds true. Yeah, that makes sense. I was wondering if we could get rid of the avail_in check (and instead just _always_ loop to drain avail_in, whether we are using consume() or not). The intent would be that the loop would be a harmless no-op in the non-consume case, because we would always drain the input completely on the first call. But that's not right; if we _didn't_ drain it, it's probably because the input is malformed (i.e., the deflated data is larger than the object size claims), and we would loop infinitely (because we are not extending the output buffer during each run of the loop). So the patch I posted earlier is the right direction. I didn't properly deal with moving last_out into the inner loop in that patch, but after checking to see where it goes, I'm pretty sure it's superfluous. Here it is with last_out removed and a proper commit message: -- >8 -- Subject: index-pack: loop while inflating objects in unpack_data When the unpack_data function is given a consume() callback, it unpacks only 64K of the input at a time, feeding it to git_inflate along with a 64K output buffer. However, because we are inflating, there is a good chance that the output buffer will fill before consuming all of the input. In this case, we need to loop on git_inflate until we have fed the whole input buffer, feeding each chunk of output to the consume buffer. The current code does not do this, and as a result, will fail the loop condition and trigger a fatal "serious inflate inconsistency" error in this case. While we're rearranging the loop, let's get rid of the extra last_out pointer. It is meant to point to the beginning of the buffer that we feed to git_inflate, but in practice this is always the beginning of our same 64K buffer, because: 1. At the beginning of the loop, we are feeding the buffer. 2. At the end of the loop, if we are using a consume() function, we reset git_inflate's pointer to the beginning of the buffer. If we are not using a consume() function, then we do not care about the value of last_out at all. Signed-off-by: Jeff King <peff@xxxxxxxx> --- builtin/index-pack.c | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 8b5c1eb..50d3876 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -524,7 +524,6 @@ static void *unpack_data(struct object_entry *obj, stream.avail_out = consume ? 64*1024 : obj->size; do { - unsigned char *last_out = stream.next_out; ssize_t n = (len < 64*1024) ? len : 64*1024; n = pread(pack_fd, inbuf, n, from); if (n < 0) @@ -538,15 +537,19 @@ static void *unpack_data(struct object_entry *obj, len -= n; stream.next_in = inbuf; stream.avail_in = n; - status = git_inflate(&stream, 0); - if (consume) { - if (consume(last_out, stream.next_out - last_out, cb_data)) { - free(inbuf); - free(data); - return NULL; - } - stream.next_out = data; - stream.avail_out = 64*1024; + if (!consume) + status = git_inflate(&stream, 0); + else { + do { + status = git_inflate(&stream, 0); + if (consume(data, stream.next_out - data, cb_data)) { + free(inbuf); + free(data); + return NULL; + } + stream.next_out = data; + stream.avail_out = 64*1024; + } while (status == Z_OK && stream.avail_in); } } while (len && status == Z_OK && !stream.avail_in); -- 1.7.11.rc1.21.g3c8d91e -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html