[RFC PATCH v10 32.5/36] unpack_trees: fix memory corruption with split_index when src != dst

Elijah Newren <newren@xxxxxxxxx> · Sat, 21 Apr 2018 12:37:36 -0700

Currently, all callers of unpack_trees() set o->src_index == o->dst_index.
Since we create a temporary index in o->result, then discard o->dst_index
and overwrite it with o->result, when o->src_index == o->dst_index it is
safe to just reuse o->src_index's split_index for o->result.  However,
o->src_index and o->dst_index are specified separately in order to allow
callers to have these be different.  In such a case, reusing
o->src_index's split_index for o->result will cause the split_index to be
shared.  If either index then has entries replaced or removed, it will
result in the other index referring to free()'d memory.

Signed-off-by: Elijah Newren <newren@xxxxxxxxx>
---

I still haven't wrapped my head around the split_index stuff entirely, so
it's possible that

  - the performance optimization isn't even valid when src == dst.  Could
    the original index be different enough from the result that we don't
    want its split_index?

  - there's a better, more performant fix or there is some way to actually
    share a split_index between two independent index_state objects.

However, with this fix, all the tests pass both normally and under
GIT_TEST_SPLIT_INDEX=DareISayYes.  Without this patch, when
GIT_TEST_SPLIT_INDEX is set, my directory rename detection series will fail
several tests, as reported by SZEDER.

 unpack-trees.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 79fd97074e..b670415d4c 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1284,9 +1284,20 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	o->result.timestamp.sec = o->src_index->timestamp.sec;
 	o->result.timestamp.nsec = o->src_index->timestamp.nsec;
 	o->result.version = o->src_index->version;
-	o->result.split_index = o->src_index->split_index;
-	if (o->result.split_index)
+	if (!o->src_index->split_index) {
+		o->result.split_index = NULL;
+	} else if (o->src_index == o->dst_index) {
+		/*
+		 * o->dst_index (and thus o->src_index) will be discarded
+		 * and overwritten with o->result at the end of this function,
+		 * so just use src_index's split_index to avoid having to
+		 * create a new one.
+		 */
+		o->result.split_index = o->src_index->split_index;
 		o->result.split_index->refcount++;
+	} else {
+		o->result.split_index = init_split_index(&o->result);
+	}
 	hashcpy(o->result.sha1, o->src_index->sha1);
 	o->merge_size = len;
 	mark_all_ce_unused(o->src_index);
-- 
2.17.0.296.gaac25b4b81