Currently, the --dirstat analysis fails to detect when lines within a file are rearranged, because the "damage" calculated by show_dirstat() is 0. However, if the SHA1 sum has changed, we already now that there should be at least some minimum amount of damage. This patch teaches show_dirstat() to assign a minimum amount of damage (== 1) to entries for which the analysis otherwise yields zero damage. Obviously this is not a complete fix, but it's at least better to underrepresent these changes, rather than simply pretending that they don't exist. Also, with the added SHA1 comparison, we can safely skip the --dirstat analysis when the SHA1s do happen to match (e.g. for a pure file rename) Signed-off-by: Johan Herland <johan@xxxxxxxxxxx> --- Documentation/diff-options.txt | 4 ++-- diff.c | 19 ++++++++++++++++++- t/t4013-diff-various.sh | 2 -- t/t4013/diff.diff_--dirstat_initial_rearrange | 1 + 4 files changed, 21 insertions(+), 5 deletions(-) diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt index 23772d6..7e4bd42 100644 --- a/Documentation/diff-options.txt +++ b/Documentation/diff-options.txt @@ -74,8 +74,8 @@ endif::git-format-patch[] counted for the parent directory, unless `--cumulative` is used. + Note that the `--dirstat` option computes the changes while ignoring -pure code movements within a file. In other words, rearranging lines -in a file is not counted as a change. +the amount of pure code movements within a file. In other words, +rearranging lines in a file is not counted as much as other changes. --dirstat-by-file[=<limit>]:: Same as `--dirstat`, but counts changed files instead of lines. diff --git a/diff.c b/diff.c index a224048..3e0bc1f 100644 --- a/diff.c +++ b/diff.c @@ -1547,6 +1547,16 @@ static void show_dirstat(struct diff_options *options) else content_changed = 1; + if (!content_changed) { + /* + * The SHA1 has not changed, so pre-/post-content is + * identical. We can therefore skip looking at the + * file contents altogether. + */ + damage = 0; + goto found_damage; + } + if (DIFF_OPT_TST(options, DIRSTAT_BY_FILE)) { /* * In --dirstat-by-file mode, we don't really need to @@ -1555,7 +1565,7 @@ static void show_dirstat(struct diff_options *options) * add this file to the list of results * (with each file contributing equal damage). */ - damage = content_changed ? 1 : 0; + damage = 1; goto found_damage; } @@ -1582,8 +1592,15 @@ static void show_dirstat(struct diff_options *options) * Original minus copied is the removed material, * added is the new material. They are both damages * made to the preimage. + * If the resulting damage is zero, we know that + * diffcore_count_changes() considers the two entries to + * be identical, but since content_changed is true, we + * know that there must have been _some_ kind of change, + * so we force all entries to have damage > 0. */ damage = (p->one->size - copied) + added; + if (!damage) + damage = 1; found_damage: ALLOC_GROW(dir.files, dir.nr + 1, dir.alloc); diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh index 6428a90..93a6f20 100755 --- a/t/t4013-diff-various.sh +++ b/t/t4013-diff-various.sh @@ -300,9 +300,7 @@ diff --no-index --name-status -- dir2 dir diff --no-index dir dir3 diff master master^ side diff --dirstat master~1 master~2 -# --dirstat doesn't notice changes that simply rearrange existing lines diff --dirstat initial rearrange -# ...but --dirstat-by-file does notice changes that only rearrange lines diff --dirstat-by-file initial rearrange EOF diff --git a/t/t4013/diff.diff_--dirstat_initial_rearrange b/t/t4013/diff.diff_--dirstat_initial_rearrange index fb2e17d..5fb02c1 100644 --- a/t/t4013/diff.diff_--dirstat_initial_rearrange +++ b/t/t4013/diff.diff_--dirstat_initial_rearrange @@ -1,2 +1,3 @@ $ git diff --dirstat initial rearrange + 100.0% dir/ $ -- 1.7.5.rc1.3.g4d7b -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html