On Fri, May 19, 2017 at 11:23 AM, Jonathan Tan <jonathantanmy@xxxxxxxxxx> wrote: > On Thu, 18 May 2017 12:37:46 -0700 > Stefan Beller <sbeller@xxxxxxxxxx> wrote: > > [snip] > >> Instead this provides a dynamic programming greedy algorithm that > > Not sure if this is called "dynamic programming". https://loveforprogramming.quora.com/Backtracking-Memoization-Dynamic-Programming http://stackoverflow.com/questions/3592943/difference-between-back-tracking-and-dynamic-programming Instead of doing backtracking (finding the lengthiest hunk for each line), we keep a set of potential hunks around, this sounds very much like the examples given in these links. > The first part of the commit message could probably be written more > concisely, like the following: ... > Having said that, thanks - this version is much more like what I would > expect. Thanks for giving a more concise commit message, will fix in a reroll. > >> +static int buffered_patch_line_cmp_no_ws(const struct buffered_patch_line *a, > >> +static int buffered_patch_line_cmp(const struct buffered_patch_line *a, > > Instead of having 2 versions of all the comparison functions, could the > ws-ness be passed as the keydata? No, this is misuse use of the API, peff explains: https://public-inbox.org/git/20170513085050.plmau5ffvzn6ibfp@xxxxxxxxxxxxxxxxxxxxx/ > >> +static unsigned get_line_hash(struct buffered_patch_line *line, unsigned ignore_ws) >> +{ >> + static struct strbuf sb = STRBUF_INIT; >> + >> + if (ignore_ws) { >> + strbuf_reset(&sb); >> + get_ws_cleaned_string(line, &sb); > > Memory leak here, I think. It's static, so we don't care. I can make it non-static and release the memory in a resend. > >> + return memhash(sb.buf, sb.len); >> + } else { >> + return memhash(line->line, line->len); >> + } >> +} > > [snip] > >> +static void add_lines_to_move_detection(struct diff_options *o) >> +{ >> + struct moved_entry *prev_line; > > gcc says (rightly) that this must be initialized. This is one of the last refactorings I did on this patch, moving the prev_line out of the diff_options struct (which is memset in its init), forgot to init it here. will fix. >> + int alt_flag = 0; > > Probably call this "use_alt_color" or something similar. Sounds better than alt_flag. >> + struct moved_entry *p = pmb[i]; >> + struct moved_entry *pnext = (p && p->next_line) ? >> + p->next_line : NULL; >> + if (pnext && >> + !buffered_patch_line_cmp(pnext->line, l, o)) { >> + pmb[i] = p->next_line; >> + } else { >> + pmb[i] = NULL; >> + } > > Memory leak of pmb[i] somewhere here? pmb[] holds pointers into moved)entry elements that are obtained via hashmap_get_next(hm, match), such that any pmb[] element is also part of a hashmap. When freeing the hashmap, we'll free the memory. This array doesn't own the underlying memory. >> @@ -4874,6 +5114,11 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o) >> >> if (o->use_buffer) { >> + if (o->color_moved) { > > Can you just declare the two hashmaps here, so that we do not need to > put them in o? They don't seem to be used outside this block anyway. Obviously. Thanks for that pointer as well. >> diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh >> index 289806d0c7..232d9ad55e 100755 >> --- a/t/t4015-diff-whitespace.sh >> +++ b/t/t4015-diff-whitespace.sh > > As for the tests, also add a test checking the interaction with > whitespace highlighting, and a test showing that diff errors out if we > ask for both move coloring and word-by-word diffing. We do not error out, but ignore the move heuristic doesn't find any blocks. I can make it error out, instead. (and add tests) Thanks, Stefan