On Sun, Sep 4, 2016 at 4:42 PM, Stefan Beller <stefanbeller@xxxxxxxxx> wrote: > When we color the diff, we'll mark moved lines with a different color. > Excellent idea. This is a very neat way to show extra information without cluttering the diff output. > This is achieved by doing a two passes over the diff. The first pass > will inspect each line of the diff and store the removed lines and the > added lines in its own hash map. > The second pass will check for each added line if that is found in the > set of removed lines. If so it will color the added line differently as > with the new `moved-new` color mode. For each removed line we check the > set of added lines and if found emit that with the new color `moved-old`. > Makes sense. > When detecting the moved lines, we cannot just rely on a line being equal, > but we need to take the context into account to detect when the moves were > reordered as we do not want to color moved but per-mutated lines. > To do that we use the hash of the preceding line. Also makes sense. > > This patch was motivated by e.g. reviewing 3b0c4200 ("apply: move > libified code from builtin/apply.c to apply.{c,h}", 2016-08-08) > Yes, this would be quite helpful. > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 0bcb679..f4f51c2 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -980,8 +980,9 @@ color.diff.<slot>:: > of `context` (context text - `plain` is a historical synonym), > `meta` (metainformation), `frag` > (hunk header), 'func' (function in hunk header), `old` (removed lines), > - `new` (added lines), `commit` (commit headers), or `whitespace` > - (highlighting whitespace errors). > + `new` (added lines), `commit` (commit headers), `whitespace` > + (highlighting whitespace errors), `moved-old` (removed lines that > + reappear), `moved-new` (added lines that were removed elsewhere). > I liked Junio's "Moved from" and "moved to" but I think moved old and moved new are ok as well. > @@ -287,6 +304,25 @@ int git_diff_basic_config(const char *var, const char *value, void *cb) > return git_default_config(var, value, cb); > } > > +static int moved_entry_cmp(const struct moved_entry *a, > + const struct moved_entry *b, > + const void *unused) > +{ > + return strcmp(a->line, b->line) && > + a->hash_prev_line == b->hash_prev_line; So we're only comparing them if they match and have a matching previous line? That seems pretty reasonable to reduce the cost of computing exact copied sequences. > + if (ecbdata->opt->color_moved) { > + int h = memhash(line, len); > + hash_prev_removed = h; > + hash_prev_added = h; > + } > } If I understand, this is to ensure that we don't keep re-hashing each line right? Thanks, Jake