Sorry it has taken so long to post this re-roll. Changes since V2: * Patches 1-3 are new and fix an existing bug. * Patch 8 includes Peff's unused parameter fix. * Patch 11 has been updated to fix a bug fix in V2. * Patch 13 has an expanded commit message explaining a change in behavior for lines starting with a form-feed. * Updated benchmark results. The bug fix in patch 3 degrades the performance, but by the end of the series the timings are the same as V2 - see the range diff. V2 Cover Letter: Thanks to Ævar and Elijah for their comments, I've reworded the commit messages, addressed the enum initialization issue in patch 2 (now 3) and added some perf tests. There are two new patches in this round. The first patch is new and adds the perf tests suggested by Ævar, the penultimate patch is also new and coverts the existing code to use a designated initializer. I've converted the benchmark results in the commit messages to use the new tests, the percentage changes are broadly similar to the previous results though I ended up running them on a different computer this time. V1 cover letter: The current implementation of diff --color-moved-ws=allow-indentation-change is considerably slower that the implementation of diff --color-moved which is in turn slower than a regular diff. This patch series starts with a couple of bug fixes and then reworks the implementation of diff --color-moved and diff --color-moved-ws=allow-indentation-change to speed them up on large diffs. The time to run git diff --color-moved --no-color-moved-ws v2.28.0 v2.29.0 is reduced by 33% and the time to run git diff --color-moved --color-moved-ws=allow-indentation-change v2.28.0 v2.29.0 is reduced by 88%. There is a small slowdown for commit sized diffs with --color-moved - the time to run git log -p --color-moved --no-color-moved-ws --no-merges -n1000 v2.29.0 is increased by 2% on recent processors. On older processors these patches reduce the running time in all cases that I've tested. In general the larger the diff the larger the speed up. As an extreme example the time to run diff --color-moved --color-moved-ws=allow-indentation-change v2.25.0 v2.30.0 goes down from 8 minutes to 6 seconds. Phillip Wood (15): diff --color-moved: add perf tests diff --color-moved: clear all flags on blocks that are too short diff --color-moved: factor out function diff --color-moved: rewind when discarding pmb diff --color-moved=zebra: fix alternate coloring diff --color-moved: avoid false short line matches and bad zerba coloring diff: simplify allow-indentation-change delta calculation diff --color-moved-ws=allow-indentation-change: simplify and optimize diff --color-moved: call comparison function directly diff --color-moved: unify moved block growth functions diff --color-moved: shrink potential moved blocks as we go diff --color-moved: stop clearing potential moved blocks diff --color-moved-ws=allow-indentation-change: improve hash lookups diff: use designated initializers for emitted_diff_symbol diff --color-moved: intern strings diff.c | 431 +++++++++++++------------------ t/perf/p4002-diff-color-moved.sh | 45 ++++ t/t4015-diff-whitespace.sh | 205 ++++++++++++++- 3 files changed, 425 insertions(+), 256 deletions(-) create mode 100755 t/perf/p4002-diff-color-moved.sh base-commit: 211eca0895794362184da2be2a2d812d070719d3 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-981%2Fphillipwood%2Fwip%2Fdiff-color-moved-tweaks-v3 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-981/phillipwood/wip/diff-color-moved-tweaks-v3 Pull-Request: https://github.com/gitgitgadget/git/pull/981 Range-diff vs v2: 1: 8fc8914a37b = 1: 8fc8914a37b diff --color-moved: add perf tests -: ----------- > 2: e9daed2360c diff --color-moved: clear all flags on blocks that are too short -: ----------- > 3: 658aec2670c diff --color-moved: factor out function -: ----------- > 4: a30f52d7f15 diff --color-moved: rewind when discarding pmb 2: 9b4e4d2674a ! 5: 1dde206b7b1 diff --color-moved=zebra: fix alternate coloring @@ diff.c: static void mark_color_as_moved(struct diff_options *o, switch (l->s) { case DIFF_SYMBOL_PLUS: @@ diff.c: static void mark_color_as_moved(struct diff_options *o, - } + &pmb, &pmb_alloc, + &pmb_nr); - if (adjust_last_block(o, n, block_length) && -- pmb_nr && last_symbol != l->s) -+ pmb_nr && last_symbol == l->s) +- if (contiguous && pmb_nr && last_symbol != l->s) ++ if (contiguous && pmb_nr && last_symbol == l->s) flipped_block = (flipped_block + 1) % 2; else flipped_block = 0; 3: 5512145c70f ! 6: 2717ff500d2 diff --color-moved: avoid false short line matches and bad zerba coloring @@ diff.c: static void mark_color_as_moved(struct diff_options *o, + if (pmb_nr && (!match || l->s != moved_symbol)) { int i; - adjust_last_block(o, n, block_length); + if (!adjust_last_block(o, n, block_length) && @@ diff.c: static void mark_color_as_moved(struct diff_options *o, pmb_nr = 0; block_length = 0; @@ diff.c: static void mark_color_as_moved(struct diff_options *o, continue; } @@ diff.c: static void mark_color_as_moved(struct diff_options *o, - } + &pmb, &pmb_alloc, + &pmb_nr); - if (adjust_last_block(o, n, block_length) && -- pmb_nr && last_symbol == l->s) -+ pmb_nr && moved_symbol == l->s) +- if (contiguous && pmb_nr && last_symbol == l->s) ++ if (contiguous && pmb_nr && moved_symbol == l->s) flipped_block = (flipped_block + 1) % 2; else flipped_block = 0; 4: 93fdef30d64 = 7: f96fa71d53c diff: simplify allow-indentation-change delta calculation 5: 6b7a8aed4ec ! 8: 324b689c915 diff --color-moved-ws=allow-indentation-change: simplify and optimize @@ Commit message git diff --color-moved-ws=allow-indentation-change v2.28.0 v2.29.0 by 93% compared to master and simplifies the code. - Test HEAD^ HEAD + Test HEAD^ HEAD --------------------------------------------------------------------------------------------------------------- - 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.41( 0.38+0.03) 0.41(0.37+0.04) +0.0% - 4002.2: diff --color-moved --no-color-moved-ws large change 0.83( 0.79+0.04) 0.82(0.79+0.02) -1.2% - 4002.3: diff --color-moved-ws=allow-indentation-change large change 13.68(13.59+0.07) 0.92(0.89+0.03) -93.3% - 4002.4: log --no-color-moved --no-color-moved-ws 1.31( 1.22+0.08) 1.31(1.21+0.10) +0.0% - 4002.5: log --color-moved --no-color-moved-ws 1.47( 1.40+0.07) 1.47(1.36+0.10) +0.0% - 4002.6: log --color-moved-ws=allow-indentation-change 1.87( 1.77+0.09) 1.50(1.41+0.09) -19.8% + 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38 (0.35+0.03) 0.38(0.35+0.03) +0.0% + 4002.2: diff --color-moved --no-color-moved-ws large change 0.86 (0.80+0.06) 0.87(0.83+0.04) +1.2% + 4002.3: diff --color-moved-ws=allow-indentation-change large change 19.01(18.93+0.06) 0.97(0.92+0.04) -94.9% + 4002.4: log --no-color-moved --no-color-moved-ws 1.16 (1.06+0.09) 1.17(1.06+0.10) +0.9% + 4002.5: log --color-moved --no-color-moved-ws 1.32 (1.25+0.07) 1.32(1.24+0.08) +0.0% + 4002.6: log --color-moved-ws=allow-indentation-change 1.71 (1.64+0.06) 1.36(1.25+0.10) -20.5% + Test master HEAD + --------------------------------------------------------------------------------------------------------------- + 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38 (0.33+0.05) 0.38(0.35+0.03) +0.0% + 4002.2: diff --color-moved --no-color-moved-ws large change 0.80 (0.75+0.04) 0.87(0.83+0.04) +8.7% + 4002.3: diff --color-moved-ws=allow-indentation-change large change 14.20(14.15+0.05) 0.97(0.92+0.04) -93.2% + 4002.4: log --no-color-moved --no-color-moved-ws 1.15 (1.05+0.09) 1.17(1.06+0.10) +1.7% + 4002.5: log --color-moved --no-color-moved-ws 1.30 (1.19+0.11) 1.32(1.24+0.08) +1.5% + 4002.6: log --color-moved-ws=allow-indentation-change 1.70 (1.63+0.06) 1.36(1.25+0.10) -20.0% + + Helped-by: Jeff King <peff@xxxxxxxx> Signed-off-by: Phillip Wood <phillip.wood@xxxxxxxxxxxxx> ## diff.c ## @@ diff.c: static int compute_ws_delta(const struct emitted_diff_symbol *a, + return 1; + } - static int cmp_in_block_with_wsd(const struct diff_options *o, - const struct moved_entry *cur, +-static int cmp_in_block_with_wsd(const struct diff_options *o, +- const struct moved_entry *cur, - const struct moved_entry *match, - struct moved_block *pmb, - int n) -+ const struct emitted_diff_symbol *l, -+ struct moved_block *pmb) - { +-{ - struct emitted_diff_symbol *l = &o->emitted_symbols->buf[n]; - int al = cur->es->len, bl = match->es->len, cl = l->len; ++static int cmp_in_block_with_wsd(const struct moved_entry *cur, ++ const struct emitted_diff_symbol *l, ++ struct moved_block *pmb) ++{ + int al = cur->es->len, bl = l->len; const char *a = cur->es->line, - *b = match->es->line, @@ diff.c: static void pmb_advance_or_null(struct diff_options *o, + struct moved_entry *prev = pmb[i].match; + struct moved_entry *cur = (prev && prev->next_line) ? + prev->next_line : NULL; -+ if (cur && !cmp_in_block_with_wsd(o, cur, l, &pmb[i])) { ++ if (cur && !cmp_in_block_with_wsd(cur, l, &pmb[i])) { /* Advance to the next line */ - pmb[i].match = pmb[i].match->next_line; + pmb[i].match = cur; 6: cfbdd447eee ! 9: f142f33276a diff --color-moved: call comparison function directly @@ Commit message Test HEAD^ HEAD ------------------------------------------------------------------------------------------------------------- - 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.41(0.37+0.04) 0.41(0.39+0.02) +0.0% - 4002.2: diff --color-moved --no-color-moved-ws large change 0.82(0.79+0.02) 0.83(0.79+0.03) +1.2% - 4002.3: diff --color-moved-ws=allow-indentation-change large change 0.92(0.89+0.03) 0.91(0.85+0.05) -1.1% - 4002.4: log --no-color-moved --no-color-moved-ws 1.31(1.21+0.10) 1.33(1.22+0.10) +1.5% - 4002.5: log --color-moved --no-color-moved-ws 1.47(1.36+0.10) 1.47(1.39+0.08) +0.0% - 4002.6: log --color-moved-ws=allow-indentation-change 1.50(1.41+0.09) 1.51(1.42+0.09) +0.7% + 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38(0.35+0.03) 0.38(0.32+0.06) +0.0% + 4002.2: diff --color-moved --no-color-moved-ws large change 0.87(0.83+0.04) 0.87(0.80+0.06) +0.0% + 4002.3: diff --color-moved-ws=allow-indentation-change large change 0.97(0.92+0.04) 0.97(0.93+0.04) +0.0% + 4002.4: log --no-color-moved --no-color-moved-ws 1.17(1.06+0.10) 1.16(1.10+0.05) -0.9% + 4002.5: log --color-moved --no-color-moved-ws 1.32(1.24+0.08) 1.31(1.22+0.09) -0.8% + 4002.6: log --color-moved-ws=allow-indentation-change 1.36(1.25+0.10) 1.35(1.25+0.10) -0.7% Signed-off-by: Phillip Wood <phillip.wood@xxxxxxxxxxxxx> 7: 73ce9b54e86 ! 10: 8f3ea865dd3 diff --color-moved: unify moved block growth functions @@ diff.c: static void pmb_advance_or_null(struct diff_options *o, - struct moved_entry *prev = pmb[i].match; - struct moved_entry *cur = (prev && prev->next_line) ? - prev->next_line : NULL; -- if (cur && !cmp_in_block_with_wsd(o, cur, l, &pmb[i])) { +- if (cur && !cmp_in_block_with_wsd(cur, l, &pmb[i])) { - /* Advance to the next line */ + if (o->color_moved_ws_handling & + COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) + match = cur && -+ !cmp_in_block_with_wsd(o, cur, l, &pmb[i]); ++ !cmp_in_block_with_wsd(cur, l, &pmb[i]); + else + match = cur && + xdiff_compare_lines(cur->es->line, cur->es->len, 8: ef8ce0e6ebc ! 11: 078c04d4a66 diff --color-moved: shrink potential moved blocks as we go @@ diff.c: static void add_lines_to_move_detection(struct diff_options *o, struct moved_entry *prev = pmb[i].match; struct moved_entry *cur = (prev && prev->next_line) ? @@ diff.c: static void pmb_advance_or_null(struct diff_options *o, + match = cur && xdiff_compare_lines(cur->es->line, cur->es->len, l->line, l->len, flags); - if (match) +- if (match) - pmb[i].match = cur; - else - moved_block_clear(&pmb[i]); -+ pmb[j++].match = cur; - } +- } -} - -static int shrink_potential_moved_blocks(struct moved_block *pmb, @@ diff.c: static void pmb_advance_or_null(struct diff_options *o, - memset(&pmb[rp], 0, sizeof(pmb[rp])); - rp--; - lp++; -- } -- } ++ if (match) { ++ pmb[j] = pmb[i]; ++ pmb[j++].match = cur; + } + } - - /* Remember the number of running sets */ - return rp + 1; + *pmb_nr = j; } - /* + static void fill_potential_moved_blocks(struct diff_options *o, @@ diff.c: static void mark_color_as_moved(struct diff_options *o, continue; } @@ diff.c: static void mark_color_as_moved(struct diff_options *o, + pmb_advance_or_null(o, l, pmb, &pmb_nr); if (pmb_nr == 0) { - /* + int contiguous = adjust_last_block(o, n, block_length); 9: 9d0a042eae1 ! 12: 618371471a0 diff --color-moved: stop clearing potential moved blocks @@ diff.c: static void mark_color_as_moved(struct diff_options *o, if (pmb_nr && (!match || l->s != moved_symbol)) { - int i; - - adjust_last_block(o, n, block_length); + if (!adjust_last_block(o, n, block_length) && + block_length > 1) { + /* +@@ diff.c: static void mark_color_as_moved(struct diff_options *o, + match = NULL; + n -= block_length; + } - for(i = 0; i < pmb_nr; i++) - moved_block_clear(&pmb[i]); pmb_nr = 0; 10: dd365ad115f ! 13: 6a8e9a2724d diff --color-moved-ws=allow-indentation-change: improve hash lookups @@ Commit message for the different modes but after the next two commits there is no measurable benefit in doing so. + There is a change in behavior for lines that begin with a form-feed or + vertical-tab character. Since b46054b374 ("xdiff: use + git-compat-util", 2019-04-11) xdiff does not treat '\f' or '\v' as + whitespace characters. This means that lines starting with those + characters are never considered to be blank and never match a line + that does not start with the same character. After this patch a line + matching "^[\f\v\r]*[ \t]*$" is considered to be blank by + --color-moved-ws=allow-indentation-change and lines beginning + "^[\f\v\r]*[ \t]*" can match another line if the suffixes match. This + changes the output of git show for d18f76dccf ("compat/regex: use the + regex engine from gawk for compat", 2010-08-17) as some lines in the + pre-image before a moved block that contain '\f' are now considered + moved as well as they match a blank line before the moved lines in the + post-image. This commit updates one of the tests to reflect this + change. + Test HEAD^ HEAD -------------------------------------------------------------------------------------------------------------- - 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.41(0.38+0.03) 0.41(0.36+0.04) +0.0% - 4002.2: diff --color-moved --no-color-moved-ws large change 0.82(0.76+0.05) 0.84(0.79+0.04) +2.4% - 4002.3: diff --color-moved-ws=allow-indentation-change large change 0.91(0.88+0.03) 0.81(0.74+0.06) -11.0% - 4002.4: log --no-color-moved --no-color-moved-ws 1.32(1.21+0.10) 1.31(1.19+0.11) -0.8% - 4002.5: log --color-moved --no-color-moved-ws 1.47(1.37+0.10) 1.47(1.36+0.11) +0.0% - 4002.6: log --color-moved-ws=allow-indentation-change 1.51(1.42+0.09) 1.48(1.37+0.10) -2.0% + 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38(0.33+0.05) 0.38(0.33+0.05) +0.0% + 4002.2: diff --color-moved --no-color-moved-ws large change 0.86(0.82+0.04) 0.88(0.84+0.04) +2.3% + 4002.3: diff --color-moved-ws=allow-indentation-change large change 0.97(0.94+0.03) 0.86(0.81+0.05) -11.3% + 4002.4: log --no-color-moved --no-color-moved-ws 1.16(1.07+0.09) 1.16(1.06+0.09) +0.0% + 4002.5: log --color-moved --no-color-moved-ws 1.32(1.26+0.06) 1.33(1.27+0.05) +0.8% + 4002.6: log --color-moved-ws=allow-indentation-change 1.35(1.29+0.06) 1.33(1.24+0.08) -1.5% Signed-off-by: Phillip Wood <phillip.wood@xxxxxxxxxxxxx> @@ diff.c: static void fill_es_indent_data(struct emitted_diff_symbol *es) + return a_width - b_width; } - static int cmp_in_block_with_wsd(const struct diff_options *o, + static int cmp_in_block_with_wsd(const struct moved_entry *cur, @@ diff.c: static int moved_entry_cmp(const void *hashmap_cmp_fn_data, const void *keydata) { @@ diff.c: static int moved_entry_cmp(const void *hashmap_cmp_fn_data, - a = container_of(eptr, const struct moved_entry, ent); - b = container_of(entry_or_key, const struct moved_entry, ent); -- ++ a = container_of(eptr, const struct moved_entry, ent)->es; ++ b = container_of(entry_or_key, const struct moved_entry, ent)->es; + - if (diffopt->color_moved_ws_handling & - COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) - /* @@ diff.c: static int moved_entry_cmp(const void *hashmap_cmp_fn_data, - * configuration for the next block is done else where - */ - flags |= XDF_IGNORE_WHITESPACE; -+ a = container_of(eptr, const struct moved_entry, ent)->es; -+ b = container_of(entry_or_key, const struct moved_entry, ent)->es; - +- - return !xdiff_compare_lines(a->es->line, a->es->len, - b->es->line, b->es->len, - flags); @@ diff.c: static struct moved_entry *prepare_entry(struct diff_options *o, hashmap_entry_init(&ret->ent, hash); ret->es = l; -@@ diff.c: static void mark_color_as_moved(struct diff_options *o, - hashmap_for_each_entry_from(hm, match, ent) { - ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc); - if (o->color_moved_ws_handling & -- COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) { -- if (compute_ws_delta(l, match->es, -- &pmb[pmb_nr].wsd)) -- pmb[pmb_nr++].match = match; -- } else { -+ COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) -+ pmb[pmb_nr].wsd = compute_ws_delta(l, match->es); -+ else - pmb[pmb_nr].wsd = 0; -- pmb[pmb_nr++].match = match; -- } -+ pmb[pmb_nr++].match = match; - } +@@ diff.c: static void fill_potential_moved_blocks(struct diff_options *o, + hashmap_for_each_entry_from(hm, match, ent) { + ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc); + if (o->color_moved_ws_handling & +- COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) { +- if (compute_ws_delta(l, match->es, &(pmb[pmb_nr]).wsd)) +- pmb[pmb_nr++].match = match; +- } else { ++ COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) ++ pmb[pmb_nr].wsd = compute_ws_delta(l, match->es); ++ else + pmb[pmb_nr].wsd = 0; +- pmb[pmb_nr++].match = match; +- } ++ pmb[pmb_nr++].match = match; + } - if (adjust_last_block(o, n, block_length) && + *pmb_p = pmb; @@ diff.c: static void diff_flush_patch_all_file_pairs(struct diff_options *o) if (o->color_moved) { struct hashmap add_lines, del_lines; @@ diff.c: static void diff_flush_patch_all_file_pairs(struct diff_options *o) hashmap_init(&del_lines, moved_entry_cmp, o, 0); hashmap_init(&add_lines, moved_entry_cmp, o, 0); + + ## t/t4015-diff-whitespace.sh ## +@@ t/t4015-diff-whitespace.sh: EMPTY='' + test_expect_success 'compare mixed whitespace delta across moved blocks' ' + + git reset --hard && +- tr Q_ "\t " <<-EOF >text.txt && +- ${EMPTY} +- ____too short without +- ${EMPTY} ++ tr "^|Q_" "\f\v\t " <<-EOF >text.txt && ++ ^__ ++ |____too short without ++ ^ + ___being grouped across blank line + ${EMPTY} + context +@@ t/t4015-diff-whitespace.sh: test_expect_success 'compare mixed whitespace delta across moved blocks' ' + git add text.txt && + git commit -m "add text.txt" && + +- tr Q_ "\t " <<-EOF >text.txt && ++ tr "^|Q_" "\f\v\t " <<-EOF >text.txt && + context + lines + to +@@ t/t4015-diff-whitespace.sh: test_expect_success 'compare mixed whitespace delta across moved blocks' ' + ${EMPTY} + QQtoo short without + ${EMPTY} +- Q_______being grouped across blank line ++ ^Q_______being grouped across blank line + ${EMPTY} + Q_QThese two lines have had their + indentation reduced by four spaces +@@ t/t4015-diff-whitespace.sh: test_expect_success 'compare mixed whitespace delta across moved blocks' ' + -c core.whitespace=space-before-tab \ + diff --color --color-moved --ws-error-highlight=all \ + --color-moved-ws=allow-indentation-change >actual.raw && +- grep -v "index" actual.raw | test_decode_color >actual && ++ grep -v "index" actual.raw | tr "\f\v" "^|" | test_decode_color >actual && + + cat <<-\EOF >expected && + <BOLD>diff --git a/text.txt b/text.txt<RESET> + <BOLD>--- a/text.txt<RESET> + <BOLD>+++ b/text.txt<RESET> + <CYAN>@@ -1,16 +1,16 @@<RESET> +- <BOLD;MAGENTA>-<RESET> +- <BOLD;MAGENTA>-<RESET><BOLD;MAGENTA> too short without<RESET> +- <BOLD;MAGENTA>-<RESET> ++ <BOLD;MAGENTA>-<RESET><BOLD;MAGENTA>^<RESET><BRED> <RESET> ++ <BOLD;MAGENTA>-<RESET><BOLD;MAGENTA>| too short without<RESET> ++ <BOLD;MAGENTA>-<RESET><BOLD;MAGENTA>^<RESET> + <BOLD;MAGENTA>-<RESET><BOLD;MAGENTA> being grouped across blank line<RESET> + <BOLD;MAGENTA>-<RESET> + <RESET>context<RESET> +@@ t/t4015-diff-whitespace.sh: test_expect_success 'compare mixed whitespace delta across moved blocks' ' + <BOLD;YELLOW>+<RESET> + <BOLD;YELLOW>+<RESET> <BOLD;YELLOW>too short without<RESET> + <BOLD;YELLOW>+<RESET> +- <BOLD;YELLOW>+<RESET> <BOLD;YELLOW> being grouped across blank line<RESET> ++ <BOLD;YELLOW>+<RESET><BOLD;YELLOW>^ being grouped across blank line<RESET> + <BOLD;YELLOW>+<RESET> + <BOLD;CYAN>+<RESET> <BRED> <RESET> <BOLD;CYAN>These two lines have had their<RESET> + <BOLD;CYAN>+<RESET><BOLD;CYAN>indentation reduced by four spaces<RESET> 11: c160222ab3c = 14: ef98a6e7015 diff: use designated initializers for emitted_diff_symbol 12: 753554587f9 ! 15: ae78c05f08d diff --color-moved: intern strings @@ Commit message number of hash lookups a little (calculating the ids still involves one hash lookup per line) but the main benefit is that when growing blocks of potentially moved lines we can replace string comparisons - which involve chasing a pointer with a simple integer comparison. + which involve chasing a pointer with a simple integer comparison. On a + large diff this commit reduces the time to run 'diff --color-moved' by + 37% compared to the previous commit and 31% compared to master, for + 'diff --color-moved-ws=allow-indentation-change' the reduction is 28% + compared to the previous commit and 96% compared to master. There is + little change in the performance of 'git log --patch' as the diffs are + smaller. - On a large diff this commit reduces the time to run - diff --color-moved - by 33% and - diff --color-moved-ws=allow-indentation-change - by 26%. Compared to master the time to run - diff --color-moved-ws=allow-indentation-change - is now reduced by 95% and the overhead compared to --no-color-moved is - reduced to 50%. + Test HEAD^ HEAD + --------------------------------------------------------------------------------------------------------------- + 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38(0.33+0.05) 0.38(0.33+0.05) +0.0% + 4002.2: diff --color-moved --no-color-moved-ws large change 0.88(0.81+0.06) 0.55(0.50+0.04) -37.5% + 4002.3: diff --color-moved-ws=allow-indentation-change large change 0.85(0.79+0.06) 0.61(0.54+0.06) -28.2% + 4002.4: log --no-color-moved --no-color-moved-ws 1.16(1.07+0.08) 1.15(1.09+0.05) -0.9% + 4002.5: log --color-moved --no-color-moved-ws 1.31(1.22+0.08) 1.29(1.19+0.09) -1.5% + 4002.6: log --color-moved-ws=allow-indentation-change 1.32(1.24+0.08) 1.31(1.18+0.13) -0.8% - Compared to the previous commit the time to run - git log --patch --color-moved - is increased slightly, but compared to master there is no change in - run time. - - Test HEAD^ HEAD - -------------------------------------------------------------------------------------------------------------- - 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.41(0.36+0.04) 0.41(0.37+0.03) +0.0% - 4002.2: diff --color-moved --no-color-moved-ws large change 0.83(0.79+0.03) 0.55(0.52+0.03) -33.7% - 4002.3: diff --color-moved-ws=allow-indentation-change large change 0.81(0.77+0.04) 0.60(0.55+0.05) -25.9% - 4002.4: log --no-color-moved --no-color-moved-ws 1.30(1.20+0.09) 1.31(1.22+0.08) +0.8% - 4002.5: log --color-moved --no-color-moved-ws 1.46(1.35+0.11) 1.47(1.30+0.16) +0.7% - 4002.6: log --color-moved-ws=allow-indentation-change 1.46(1.38+0.07) 1.47(1.34+0.13) +0.7% - - Test master HEAD - -------------------------------------------------------------------------------------------------------------- - 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.40( 0.36+0.03) 0.41(0.37+0.03) +2.5% - 4002.2: diff --color-moved --no-color-moved-ws large change 0.82( 0.77+0.04) 0.55(0.52+0.03) -32.9% - 4002.3: diff --color-moved-ws=allow-indentation-change large change 14.10(14.04+0.04) 0.60(0.55+0.05) -95.7% - 4002.4: log --no-color-moved --no-color-moved-ws 1.31( 1.21+0.09) 1.31(1.22+0.08) +0.0% - 4002.5: log --color-moved --no-color-moved-ws 1.47( 1.37+0.09) 1.47(1.30+0.16) +0.0% - 4002.6: log --color-moved-ws=allow-indentation-change 1.86( 1.76+0.10) 1.47(1.34+0.13) -21.0% + Test master HEAD + --------------------------------------------------------------------------------------------------------------- + 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38 (0.33+0.05) 0.38(0.33+0.05) +0.0% + 4002.2: diff --color-moved --no-color-moved-ws large change 0.80 (0.75+0.04) 0.55(0.50+0.04) -31.2% + 4002.3: diff --color-moved-ws=allow-indentation-change large change 14.20(14.15+0.05) 0.61(0.54+0.06) -95.7% + 4002.4: log --no-color-moved --no-color-moved-ws 1.15 (1.05+0.09) 1.15(1.09+0.05) +0.0% + 4002.5: log --color-moved --no-color-moved-ws 1.30 (1.19+0.11) 1.29(1.19+0.09) -0.8% + 4002.6: log --color-moved-ws=allow-indentation-change 1.70 (1.63+0.06) 1.31(1.18+0.13) -22.9% Signed-off-by: Phillip Wood <phillip.wood@xxxxxxxxxxxxx> @@ diff.c: static void append_emitted_diff_symbol(struct diff_options *o, }; struct moved_block { -@@ diff.c: static int cmp_in_block_with_wsd(const struct diff_options *o, +@@ diff.c: static int cmp_in_block_with_wsd(const struct moved_entry *cur, const struct emitted_diff_symbol *l, struct moved_block *pmb) { @@ diff.c: static int cmp_in_block_with_wsd(const struct diff_options *o, */ delta = b_width - a_width; -@@ diff.c: static int cmp_in_block_with_wsd(const struct diff_options *o, +@@ diff.c: static int cmp_in_block_with_wsd(const struct moved_entry *cur, if (pmb->wsd == INDENT_BLANKLINE) pmb->wsd = delta; @@ diff.c: static void pmb_advance_or_null(struct diff_options *o, int match; @@ diff.c: static void pmb_advance_or_null(struct diff_options *o, match = cur && - !cmp_in_block_with_wsd(o, cur, l, &pmb[i]); + !cmp_in_block_with_wsd(cur, l, &pmb[i]); else - match = cur && - xdiff_compare_lines(cur->es->line, cur->es->len, - l->line, l->len, flags); + match = cur && cur->es->id == l->id; + - if (match) + if (match) { + pmb[j] = pmb[i]; pmb[j++].match = cur; - } +@@ diff.c: static void pmb_advance_or_null(struct diff_options *o, + } + + static void fill_potential_moved_blocks(struct diff_options *o, +- struct hashmap *hm, + struct moved_entry *match, + struct emitted_diff_symbol *l, + struct moved_block **pmb_p, +@@ diff.c: static void fill_potential_moved_blocks(struct diff_options *o, + * The current line is the start of a new block. + * Setup the set of potential blocks. + */ +- hashmap_for_each_entry_from(hm, match, ent) { ++ for (; match; match = match->next_match) { + ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc); + if (o->color_moved_ws_handling & + COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) @@ diff.c: static int adjust_last_block(struct diff_options *o, int n, int block_length) /* Find blocks of moved code, delegate actual coloring decision to helper */ @@ diff.c: static void mark_color_as_moved(struct diff_options *o, default: flipped_block = 0; @@ diff.c: static void mark_color_as_moved(struct diff_options *o, - * The current line is the start of a new block. - * Setup the set of potential blocks. - */ -- hashmap_for_each_entry_from(hm, match, ent) { -+ for (; match; match = match->next_match) { - ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc); - if (o->color_moved_ws_handling & - COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) + */ + n -= block_length; + else +- fill_potential_moved_blocks(o, hm, match, l, ++ fill_potential_moved_blocks(o, match, l, + &pmb, &pmb_alloc, + &pmb_nr); + @@ diff.c: static void diff_flush_patch_all_file_pairs(struct diff_options *o) if (o->emitted_symbols) { -- gitgitgadget