[PATCH] merge_blobs: use strbuf instead of manually-sized mmfile_t

On Mon, Feb 15, 2016 at 10:39:39PM +0100, Stefan Frühwirth wrote:

> in one specific circumstance, git-merge-tree exits with a segfault caused by
> "*** Error in `git': malloc(): memory corruption (fast)":
> 
> There has to be at least one commit first (as far as I can tell it doesn't
> matter what content). Then create a tree containing a file with a leading
> newline character (\n) followed by some random string, and another tree with
> a file containing a string without leading newline. Now merge trees:
> Segmentation fault.
> 
> There is a test case[1] kindly provided by chrisrossi, which he crafted
> after I discovered the problem[2] in the context of Pylons/acidfs.

Wow, I had to look up what "git merge-tree" even is. It looks like a
proof-of-concept added by 492e075 (Handling large files with GIT,
2006-02-14) that has somehow hung around forever.

I find some of the merging code there questionable, and I wonder if
people are actually using it.  And yet there is this report, and it has
received one or two fixes over the years. So maybe people are.

Anyway, here is an immediate fix for the memory corruption. I'm pretty
sure the _result_ is still buggy in this case, as explained below. I
suspect this weird add/add case should just be a full conflict (like it
is for the normal merge code), and we should just be using ll_merge()
directly. But I have to admit I have very little desire to think hard on
this crufty code. My first preference would be to remove it, but I don't
want to hurt people who might actually be using it. But they can do
their own hard-thinking.

-- >8 --
Subject: merge_blobs: use strbuf instead of manually-sized mmfile_t

The ancient merge_blobs function (which is used nowhere
except in the equally ancient git-merge-tree, which does
not itself seem to be called by any modern git code) tries
to create a plausible base object for an add/add conflict by
finding the common parts of the "ours" and "theirs" blobs.
It does so by calling xdiff with XDL_EMIT_COMMON, and
stores the result in a buffer that is as big as the smaller
of "ours" and "theirs".

In theory, this is right; we cannot have more common content
than is in the smaller of the two blobs. But in practice,
xdiff may give us more: if neither file ends in a newline,
we get the "\n\\ No newline at end of file\n" marker.

This is somewhat of a bug in itself (the "no newline" string
becomes part of the blob output!), but much worse is that we
may overflow our output buffer with this string (if the
common content was otherwise close to the size of the
smaller blob).

The minimal fix for the memory corruption is to size the
buffer appropriately. We could do so by manually adding in
an extra 29 bytes for the "no newline" string to our buffer
size. But that's somewhat fragile. Instead, let's replace
the fixed-size output buffer with a strbuf which can grow as
necessary.

Reported-by: Stefan Frühwirth <stefan.fruehwirth@xxxxxxxxxxx>
Signed-off-by: Jeff King <peff@xxxxxxxx>
---
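Not part of the commit: a minimal standalone sketch of the sizing
problem described above. It does not call xdiff. It assumes the marker
is the 29-byte string "\n\\ No newline at end of file\n", and the names
growbuf, growbuf_add and emit_common are hypothetical stand-ins (growbuf
playing the role of strbuf). It only illustrates that the emitted bytes
can exceed the size of the smaller blob, and that a growable buffer
absorbs the difference.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed marker text; 29 bytes excluding the terminating NUL. */
static const char no_newline_marker[] = "\n\\ No newline at end of file\n";

/* Hypothetical stand-in for git's strbuf: a buffer that grows on demand. */
struct growbuf {
	char *ptr;
	size_t len;
	size_t alloc;
};

/* Append len bytes, enlarging the allocation first if needed. */
static int growbuf_add(struct growbuf *b, const void *data, size_t len)
{
	assert(b && data);
	if (b->len + len > b->alloc) {
		size_t want = b->alloc ? b->alloc * 2 : 16;
		char *p;

		while (want < b->len + len)
			want *= 2;
		p = realloc(b->ptr, want);
		if (!p)
			return -1;
		b->ptr = p;
		b->alloc = want;
	}
	memcpy(b->ptr + b->len, data, len);
	b->len += len;
	return 0;
}

/*
 * Simulate the "common" output for two blobs that share `common` and
 * both lack a trailing newline: the shared bytes, then the marker.
 * Returns the total number of bytes emitted, or 0 if allocation failed.
 */
static size_t emit_common(struct growbuf *out, const char *common)
{
	if (growbuf_add(out, common, strlen(common)) < 0 ||
	    growbuf_add(out, no_newline_marker,
			sizeof(no_newline_marker) - 1) < 0)
		return 0;
	return out->len;
}
```

With three bytes of common content, emit_common() produces 3 + 29
bytes in total. If the smaller blob were exactly those three bytes, a
buffer sized to it would be overrun by the marker alone.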
 merge-blobs.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/merge-blobs.c b/merge-blobs.c
index ddca601..acfd110 100644
--- a/merge-blobs.c
+++ b/merge-blobs.c
@@ -51,19 +51,16 @@ static void *three_way_filemerge(const char *path, mmfile_t *base, mmfile_t *our
 static int common_outf(void *priv_, mmbuffer_t *mb, int nbuf)
 {
 	int i;
-	mmfile_t *dst = priv_;
+	struct strbuf *dst = priv_;
 
-	for (i = 0; i < nbuf; i++) {
-		memcpy(dst->ptr + dst->size, mb[i].ptr, mb[i].size);
-		dst->size += mb[i].size;
-	}
+	for (i = 0; i < nbuf; i++)
+		strbuf_add(dst, mb[i].ptr, mb[i].size);
 	return 0;
 }
 
 static int generate_common_file(mmfile_t *res, mmfile_t *f1, mmfile_t *f2)
 {
-	unsigned long size = f1->size < f2->size ? f1->size : f2->size;
-	void *ptr = xmalloc(size);
+	struct strbuf out = STRBUF_INIT;
 	xpparam_t xpp;
 	xdemitconf_t xecfg;
 	xdemitcb_t ecb;
@@ -75,11 +72,15 @@ static int generate_common_file(mmfile_t *res, mmfile_t *f1, mmfile_t *f2)
 	xecfg.flags = XDL_EMIT_COMMON;
 	ecb.outf = common_outf;
 
-	res->ptr = ptr;
-	res->size = 0;
+	ecb.priv = &out;
+	if (xdi_diff(f1, f2, &xpp, &xecfg, &ecb) < 0) {
+		strbuf_release(&out);
+		return -1;
+	}
 
-	ecb.priv = res;
-	return xdi_diff(f1, f2, &xpp, &xecfg, &ecb);
+	res->size = out.len; /* avoid long/size_t pointer mismatch below */
+	res->ptr = strbuf_detach(&out, NULL);
+	return 0;
 }
 
 void *merge_blobs(const char *path, struct blob *base, struct blob *our, struct blob *their, unsigned long *size)
-- 
2.7.1.574.gccd43a9


