[PATCH v2] diffcore-rename: fix BUG when break detection and --follow used together

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Elijah Newren <newren@xxxxxxxxx>

Prior to commit 9db2ac56168e (diffcore-rename: accelerate rename_dst
setup, 2020-12-11), the function add_rename_dst() resulted in quadratic
runtime since each call inserted the new entry into the array in sorted
order.  The reason for the sorted order requirement was so that
locate_rename_dst(), used when break detection is turned on, could find
the appropriate entry in logarithmic time via bisection on string
comparisons.  (It's better to be quadratic in moving pointers than
quadratic in string comparisons, so this made some sense.)  However,
since break detection always sticks the broken pairs adjacent to each
other, that commit decided to simply append entries to rename_dst, and
record the mapping of (filename) -> (index within rename_dst) via a
strintmap.  Doing this relied on the fact that when adding the source of
a broken pair via register_rename_src(), that the next item we'd process
was the other half of the same broken pair and would be added to
rename_dst via add_rename_dst().  This assumption was fine under break
detection alone, but the combination of break detection and
single_follow violated that assumption because of this code:

		else if (options->single_follow &&
			 strcmp(options->single_follow, p->two->path))
			continue; /* not interested */

which would end up skipping calling add_rename_dst() below that point.
Since I knew I was assuming that the dst pair of a break would always be
added right after the src pair of a break, I added a new BUG() directive
as part of that commit later on at time of use that would check my
assumptions held.  That BUG() didn't trip for nearly 4 years...which
sadly meant I had long since forgotten the related details.  Anyway...

When the dst half of a broken pair is skipped like this, it means that
not only could my recorded index be invalid (just past the end of the
array), it could also point to some unrelated dst that just happened to
be the next one added to the array.  So, to fix this, we need to add a
little more safety around the checks for the recorded break_idx.

It turns out that making a testcase to trigger this is quite the
challenge.  I actually added two testscases:
  * One testcase which uses --follow incorrectly (it uses its single
    pathspec to specifying something other than a single filename), and
    which triggers the same bug reported-by Olaf.  This triggers a
    special case within locate_rename_dst() where idx evaluates to 0
    and rename_dst is NULL, meaning that our return value of
    &rename_dst[idx] happens to evaluate to NULL as well.  This
    addressing of an index into a NULL array hints at deeper problems,
    which are raised in the next testcase...
  * A second testcase which when run under valgrind shows that the code
    actually depends upon unintialized memory, in particular the entry
    just after the end of the rename_dst array.

In short, when the two rare options -B and --follow are used together,
fix the accidental find of the wrong dst entry (which would often be
uninitialized memory just past the end of the array, but also could
have just been a dst for an unrelated path if no dst was recorded for
the expected path).  Do so by adding a little more care around checking
the recorded indices in break_idx.

Reported-by: Olaf Hering <olaf@xxxxxxxxx>
Signed-off-by: Elijah Newren <newren@xxxxxxxxx>
---
    diffcore-rename: fix BUG when break detection and --follow used together
    
    Bug dates back to Git v2.31.0, and was discovered and reported about
    four years later over at
    https://lore.kernel.org/git/20240920112228.3d1130f5.olaf@xxxxxxxxx/.
    Sadly, took me about half a year to get back to this one...
    
    Changes since v1:
    
     * Added a testcase, and extended the commit message slightly

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1876%2Fnewren%2Ffix-break-follow-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1876/newren/fix-break-follow-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1876

Range-diff vs v1:

 1:  e14c1193905 ! 1:  a68e7fe19f7 diffcore-rename: fix BUG when break detection and --follow used together
     @@ Commit message
          be the next one added to the array.  So, to fix this, we need to add a
          little more safety around the checks for the recorded break_idx.
      
     -    It turns out that making a testcase to trigger this is a bit challenging
     -    too.  I added a simple testcase which tickles the necessary area, but
     -    running it normally actually passes for me.  However, running it under
     -    valgrind shows that it is depending upon uninitialized memory.  I
     -    suspect that to get a reliable reproduction case, I might need to have
     -    several more paths involved, but that might make the testcase more
     -    difficult to understand.  So, I instead just embedded a warning within
     -    the testname that the test triggered uninitialized memory use.
     +    It turns out that making a testcase to trigger this is quite the
     +    challenge.  I actually added two testscases:
     +      * One testcase which uses --follow incorrectly (it uses its single
     +        pathspec to specifying something other than a single filename), and
     +        which triggers the same bug reported-by Olaf.  This triggers a
     +        special case within locate_rename_dst() where idx evaluates to 0
     +        and rename_dst is NULL, meaning that our return value of
     +        &rename_dst[idx] happens to evaluate to NULL as well.  This
     +        addressing of an index into a NULL array hints at deeper problems,
     +        which are raised in the next testcase...
     +      * A second testcase which when run under valgrind shows that the code
     +        actually depends upon unintialized memory, in particular the entry
     +        just after the end of the rename_dst array.
      
     -    In short, when these two rare options are used together, fix the
     -    accidental find of the wrong dst entry (which would often be
     -    uninitialized memory just past the end of the array), by adding a little
     -    more care around the recorded indices for break_idx.
     +    In short, when the two rare options -B and --follow are used together,
     +    fix the accidental find of the wrong dst entry (which would often be
     +    uninitialized memory just past the end of the array, but also could
     +    have just been a dst for an unrelated path if no dst was recorded for
     +    the expected path).  Do so by adding a little more care around checking
     +    the recorded indices in break_idx.
      
          Reported-by: Olaf Hering <olaf@xxxxxxxxx>
          Signed-off-by: Elijah Newren <newren@xxxxxxxxx>
     @@ t/t4206-log-follow-harder-copies.sh: test_expect_success 'validate the output.'
       	compare_diff_patch current expected
       '
       
     -+test_expect_success 'log --follow -B does not die or use uninitialized memory' '
     ++test_expect_success 'log --follow -B does not BUG' '
      +	git switch --orphan break_and_follow_are_icky_so_use_both &&
     ++
     ++	test_seq 1 127 >numbers &&
     ++	git add numbers &&
     ++	git commit -m "numbers" &&
     ++
     ++	printf "%s\n" A B C D E F G H I J K L M N O Q R S T U V W X Y Z >pool &&
     ++	echo changed >numbers &&
     ++	git add pool numbers &&
     ++	git commit -m "pool" &&
     ++
     ++	git log -1 -B --raw --follow -- "p*"
     ++'
     ++
     ++test_expect_success 'log --follow -B does not die or use uninitialized memory' '
      +	printf "%s\n" A B C D E F G H I J K L M N O P Q R S T U V W X Y Z >z &&
      +	git add z &&
      +	git commit -m "Initial" &&


 diffcore-rename.c                   |  9 ++++----
 t/t4206-log-follow-harder-copies.sh | 32 +++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index 10bb0321b10..cb4be5be63c 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -33,7 +33,7 @@ static struct diff_rename_dst *locate_rename_dst(struct diff_filepair *p)
 {
 	/* Lookup by p->ONE->path */
 	int idx = break_idx ? strintmap_get(break_idx, p->one->path) : -1;
-	return (idx == -1) ? NULL : &rename_dst[idx];
+	return (idx == -1 || idx == rename_dst_nr) ? NULL : &rename_dst[idx];
 }
 
 /*
@@ -1668,9 +1668,10 @@ void diffcore_rename_extended(struct diff_options *options,
 			if (DIFF_PAIR_BROKEN(p)) {
 				/* broken delete */
 				struct diff_rename_dst *dst = locate_rename_dst(p);
-				if (!dst)
-					BUG("tracking failed somehow; failed to find associated dst for broken pair");
-				if (dst->is_rename)
+				if (options->single_follow && dst &&
+				    strcmp(dst->p->two->path, p->two->path))
+					dst = NULL;
+				if (dst && dst->is_rename)
 					/* counterpart is now rename/copy */
 					pair_to_free = p;
 			}
diff --git a/t/t4206-log-follow-harder-copies.sh b/t/t4206-log-follow-harder-copies.sh
index bcab71c8e84..190c4843211 100755
--- a/t/t4206-log-follow-harder-copies.sh
+++ b/t/t4206-log-follow-harder-copies.sh
@@ -54,4 +54,36 @@ test_expect_success 'validate the output.' '
 	compare_diff_patch current expected
 '
 
+test_expect_success 'log --follow -B does not BUG' '
+	git switch --orphan break_and_follow_are_icky_so_use_both &&
+
+	test_seq 1 127 >numbers &&
+	git add numbers &&
+	git commit -m "numbers" &&
+
+	printf "%s\n" A B C D E F G H I J K L M N O Q R S T U V W X Y Z >pool &&
+	echo changed >numbers &&
+	git add pool numbers &&
+	git commit -m "pool" &&
+
+	git log -1 -B --raw --follow -- "p*"
+'
+
+test_expect_success 'log --follow -B does not die or use uninitialized memory' '
+	printf "%s\n" A B C D E F G H I J K L M N O P Q R S T U V W X Y Z >z &&
+	git add z &&
+	git commit -m "Initial" &&
+
+	test_seq 1 130 >z &&
+	echo lame >somefile &&
+	git add z somefile &&
+	git commit -m "Rewrite z, introduce lame somefile" &&
+
+	echo Content >somefile &&
+	git add somefile &&
+	git commit -m "Rewrite somefile" &&
+
+	git log -B --follow somefile
+'
+
 test_done

base-commit: f93ff170b93a1782659637824b25923245ac9dd1
-- 
gitgitgadget




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux