[PATCHv2 3/3] Teach --dirstat to not completely ignore rearranged lines within a file

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Currently, the --dirstat analysis fails to detect when lines within a
file are rearranged, because the "damage" calculated by show_dirstat()
is 0. However, if the SHA1 sum has changed, we already now that there
should be at least some minimum amount of damage.

This patch teaches show_dirstat() to assign a minimum amount of damage
(== 1) to entries for which the analysis otherwise yields zero damage.
Obviously this is not a complete fix, but it's at least better to
underrepresent these changes, rather than simply pretending that they
don't exist.

Also, with the added SHA1 comparison, we can safely skip the --dirstat
analysis when the SHA1s do happen to match (e.g. for a pure file rename)

Signed-off-by: Johan Herland <johan@xxxxxxxxxxx>
---
 Documentation/diff-options.txt                |    4 ++--
 diff.c                                        |   19 ++++++++++++++++++-
 t/t4013-diff-various.sh                       |    2 --
 t/t4013/diff.diff_--dirstat_initial_rearrange |    1 +
 4 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 23772d6..7e4bd42 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -74,8 +74,8 @@ endif::git-format-patch[]
 	counted for the parent directory, unless `--cumulative` is used.
 +
 Note that the `--dirstat` option computes the changes while ignoring
-pure code movements within a file.  In other words, rearranging lines
-in a file is not counted as a change.
+the amount of pure code movements within a file.  In other words,
+rearranging lines in a file is not counted as much as other changes.
 
 --dirstat-by-file[=<limit>]::
 	Same as `--dirstat`, but counts changed files instead of lines.
diff --git a/diff.c b/diff.c
index a224048..3e0bc1f 100644
--- a/diff.c
+++ b/diff.c
@@ -1547,6 +1547,16 @@ static void show_dirstat(struct diff_options *options)
 		else
 			content_changed = 1;
 
+		if (!content_changed) {
+			/*
+			 * The SHA1 has not changed, so pre-/post-content is
+			 * identical. We can therefore skip looking at the
+			 * file contents altogether.
+			 */
+			damage = 0;
+			goto found_damage;
+		}
+
 		if (DIFF_OPT_TST(options, DIRSTAT_BY_FILE)) {
 			/*
 			 * In --dirstat-by-file mode, we don't really need to
@@ -1555,7 +1565,7 @@ static void show_dirstat(struct diff_options *options)
 			 * add this file to the list of results
 			 * (with each file contributing equal damage).
 			 */
-			damage = content_changed ? 1 : 0;
+			damage = 1;
 			goto found_damage;
 		}
 
@@ -1582,8 +1592,15 @@ static void show_dirstat(struct diff_options *options)
 		 * Original minus copied is the removed material,
 		 * added is the new material.  They are both damages
 		 * made to the preimage.
+		 * If the resulting damage is zero, we know that
+		 * diffcore_count_changes() considers the two entries to
+		 * be identical, but since content_changed is true, we
+		 * know that there must have been _some_ kind of change,
+		 * so we force all entries to have damage > 0.
 		 */
 		damage = (p->one->size - copied) + added;
+		if (!damage)
+			damage = 1;
 
 found_damage:
 		ALLOC_GROW(dir.files, dir.nr + 1, dir.alloc);
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 6428a90..93a6f20 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -300,9 +300,7 @@ diff --no-index --name-status -- dir2 dir
 diff --no-index dir dir3
 diff master master^ side
 diff --dirstat master~1 master~2
-# --dirstat doesn't notice changes that simply rearrange existing lines
 diff --dirstat initial rearrange
-# ...but --dirstat-by-file does notice changes that only rearrange lines
 diff --dirstat-by-file initial rearrange
 EOF
 
diff --git a/t/t4013/diff.diff_--dirstat_initial_rearrange b/t/t4013/diff.diff_--dirstat_initial_rearrange
index fb2e17d..5fb02c1 100644
--- a/t/t4013/diff.diff_--dirstat_initial_rearrange
+++ b/t/t4013/diff.diff_--dirstat_initial_rearrange
@@ -1,2 +1,3 @@
 $ git diff --dirstat initial rearrange
+ 100.0% dir/
 $
-- 
1.7.5.rc1.3.g4d7b

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]