[PATCH v2 4/4] gitdiffcore doc: mention new preliminary step for rename detection

From: Elijah Newren <newren@xxxxxxxxx>

The last few patches have introduced a new preliminary step when rename
detection is on but both break detection and copy detection are off.
Document this new step, and add a test case that checks the new
behavior.

Signed-off-by: Elijah Newren <newren@xxxxxxxxx>
---
 Documentation/gitdiffcore.txt | 15 +++++++++++++++
 t/t4001-diff-rename.sh        | 24 ++++++++++++++++++++++++
 2 files changed, 39 insertions(+)
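
For readers of this patch, the candidate selection that the documentation
below describes can be sketched in plain shell.  This is only an
illustration and not the code in diffcore-rename.c: basename_candidates is
a hypothetical helper, it assumes input in the "D<TAB>path" / "A<TAB>path"
form printed by `git diff-tree -r --name-status` without -M, and it
assumes "at most one comparison per file" means a basename must be unique
among both the deleted and the added paths.  The similarity check that
decides whether a candidate pair really is a rename is not modeled.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): read name-status lines on stdin
# and print "deleted<TAB>added" for every basename that occurs exactly
# once among the deleted paths and exactly once among the added paths.
basename_candidates () {
	awk -F'\t' '
		# Strip everything up to and including the last slash.
		function base(p) { sub(/.*\//, "", p); return p }
		$1 == "D" { b = base($2); dcount[b]++; dpath[b] = $2 }
		$1 == "A" { b = base($2); acount[b]++; apath[b] = $2 }
		END {
			for (b in dcount)
				if (dcount[b] == 1 && acount[b] == 1)
					print dpath[b] "\t" apath[b]
		}
	' | sort
}

# The example from the documentation text: only the two extensions.txt
# paths share a basename, so they form the single candidate pair.
printf 'D\tdocs/extensions.txt\nA\tdocs/config/extensions.txt\nA\tsrc/extension-checks.c\n' |
basename_candidates
```

If a second extensions.txt were added or deleted elsewhere in the tree,
the helper would print nothing for that basename, which corresponds to the
preliminary step being skipped for those files.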

diff --git a/Documentation/gitdiffcore.txt b/Documentation/gitdiffcore.txt
index c970d9fe438a..954ae3ef1082 100644
--- a/Documentation/gitdiffcore.txt
+++ b/Documentation/gitdiffcore.txt
@@ -168,6 +168,21 @@ a similarity score different from the default of 50% by giving a
 number after the "-M" or "-C" option (e.g. "-M8" to tell it to use
 8/10 = 80%).
 
+Note that when rename detection is on but both copy and break
+detection are off, rename detection adds a preliminary step that first
+checks files with the same basename.  If files with the same basename
+are sufficiently similar, it will mark them as renames and exclude
+them from the later quadratic step (the one that pairwise compares all
+unmatched files to find the "best" matches, determined by the highest
+content similarity).  So, for example, if docs/extensions.txt and
+docs/config/extensions.txt have similar content, then they will be
+marked as a rename even if it turns out that docs/extensions.txt was
+more similar to src/extension-checks.c.  At most one comparison is
+done per file in this preliminary pass, so if several extensions.txt
+files throughout the directory hierarchy were added and deleted, there
+is no unique pairing and this preliminary step is skipped for those
+files.
+
 Note.  When the "-C" option is used with `--find-copies-harder`
 option, 'git diff-{asterisk}' commands feed unmodified filepairs to
 diffcore mechanism as well as modified ones.  This lets the copy
diff --git a/t/t4001-diff-rename.sh b/t/t4001-diff-rename.sh
index c16486a9d41a..bf62537c29a0 100755
--- a/t/t4001-diff-rename.sh
+++ b/t/t4001-diff-rename.sh
@@ -262,4 +262,28 @@ test_expect_success 'diff-tree -l0 defaults to a big rename limit, not zero' '
 	grep "myotherfile.*myfile" actual
 '
 
+test_expect_success 'basename similarity vs best similarity' '
+	mkdir subdir &&
+	test_write_lines line1 line2 line3 line4 line5 \
+			 line6 line7 line8 line9 line10 >subdir/file.txt &&
+	git add subdir/file.txt &&
+	git commit -m "base txt" &&
+
+	git rm subdir/file.txt &&
+	test_write_lines line1 line2 line3 line4 line5 \
+			  line6 line7 line8 >file.txt &&
+	test_write_lines line1 line2 line3 line4 line5 \
+			  line6 line7 line8 line9 >file.md &&
+	git add file.txt file.md &&
+	git commit -a -m "rename" &&
+	git diff-tree -r -M --name-status HEAD^ HEAD >actual &&
+	# subdir/file.txt is 88% similar to file.md, 78% similar to file.txt,
+	# but same basenames are checked first, so file.txt is the rename.
+	cat >expected <<-\EOF &&
+	A	file.md
+	R078	subdir/file.txt	file.txt
+	EOF
+	test_cmp expected actual
+'
+
 test_done
-- 
gitgitgadget


