Re: [BUG] serious inflate inconsistency on master

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jul 03, 2012 at 11:31:43PM -0700, Junio C Hamano wrote:

> Nguyen Thai Ngoc Duy <pclouds@xxxxxxxxx> writes:
> 
> > By the way I searched the commit that introduces that check with "git
> > log --follow -p builtin/index-pack.c" but I could not find it. What
> > did I do wrong?
> 
> Your commit 8a2e163cc shows changes to the file at ll.535-540; these
> come from 776ea370 builtin-index-pack.c ll.383-388.
> 
>   $ git show 776ea370:builtin-index-pack.c
> 
> The get_data_from_pack() function in that commit gives sufficient
> buffer to output side (avail_out starts out as obj->size), and feeds
> the data from the packfile in chunks.  With the arrangement this
> commit makes to call git_inflate(), it should never get stuck
> because it ran out of output buffer.  In each iteration of the loop,
> when the function returns, status should read Z_OK and the function
> should have consumed all input.
> 
> But the version that uses consume() function does not give
> sufficient output buffer to ensure that the input will always be
> inflated fully (avoiding to use large output buffer is the whole
> point of your patch after all), so with your patch, that no longer
> holds true.

Yeah, that makes sense. I was wondering if we could get rid of the
avail_in check (and instead just _always_ loop to drain avail_in,
whether we are using consume() or not). The intent would be that the
loop would be a harmless no-op in the non-consume case, because we would
always drain the input completely on the first call.

But that's not right; if we _didn't_ drain it, it's probably because the
input is malformed (i.e., the deflated data is larger than the object
size claims), and we would loop infinitely (because we are not extending
the output buffer during each run of the loop).

So the patch I posted earlier is the right direction. I didn't properly
deal with moving last_out into the inner loop in that patch, but after
checking to see where it goes, I'm pretty sure it's superfluous. Here it
is with last_out removed and a proper commit message:

-- >8 --
Subject: index-pack: loop while inflating objects in unpack_data

When the unpack_data function is given a consume() callback,
it unpacks only 64K of the input at a time, feeding it to
git_inflate along with a 64K output buffer.  However,
because we are inflating, there is a good chance that the
output buffer will fill before consuming all of the input.
In this case, we need to loop on git_inflate until we have
fed the whole input buffer, feeding each chunk of output to
the consume buffer.

The current code does not do this, and as a result, will
fail the loop condition and trigger a fatal "serious inflate
inconsistency" error in this case.

While we're rearranging the loop, let's get rid of the
extra last_out pointer. It is meant to point to the
beginning of the buffer that we feed to git_inflate, but in
practice this is always the beginning of our same 64K
buffer, because:

  1. At the beginning of the loop, we are feeding the
     buffer.

  2. At the end of the loop, if we are using a consume()
     function, we reset git_inflate's pointer to the
     beginning of the buffer.  If we are not using a
     consume() function, then we do not care about the value
     of last_out at all.

Signed-off-by: Jeff King <peff@xxxxxxxx>
---
 builtin/index-pack.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 8b5c1eb..50d3876 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -524,7 +524,6 @@ static void *unpack_data(struct object_entry *obj,
 	stream.avail_out = consume ? 64*1024 : obj->size;
 
 	do {
-		unsigned char *last_out = stream.next_out;
 		ssize_t n = (len < 64*1024) ? len : 64*1024;
 		n = pread(pack_fd, inbuf, n, from);
 		if (n < 0)
@@ -538,15 +537,19 @@ static void *unpack_data(struct object_entry *obj,
 		len -= n;
 		stream.next_in = inbuf;
 		stream.avail_in = n;
-		status = git_inflate(&stream, 0);
-		if (consume) {
-			if (consume(last_out, stream.next_out - last_out, cb_data)) {
-				free(inbuf);
-				free(data);
-				return NULL;
-			}
-			stream.next_out = data;
-			stream.avail_out = 64*1024;
+		if (!consume)
+			status = git_inflate(&stream, 0);
+		else {
+			do {
+				status = git_inflate(&stream, 0);
+				if (consume(data, stream.next_out - data, cb_data)) {
+					free(inbuf);
+					free(data);
+					return NULL;
+				}
+				stream.next_out = data;
+				stream.avail_out = 64*1024;
+			} while (status == Z_OK && stream.avail_in);
 		}
 	} while (len && status == Z_OK && !stream.avail_in);
 
-- 
1.7.11.rc1.21.g3c8d91e

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]