On Tue, 24 Apr 2018 16:23:28 -0700 Stefan Beller <sbeller@xxxxxxxxxx> wrote: > >> s/missmatch/mismatch/ > >> Also, what is this used for? > > > > The mismatch should be used for (thanks for catching!) > > checking if the remainder of a line is the same, although a boolean > > may be not the correct choice. We know that the two strings > > passed into compute_ws_delta come from a complete white space > > agnostic comparison, so consider: > > > > + SP SP more TAB more > > + SP SP text TAB text > > > > - SP more TAB more > > - SP text TAB text > > > > which would mark this as a moved block. This is the feature > > working as intended, but what about > > > > + SP SP more TAB more > > + SP SP text TAB text > > > > - SP more SP more > > - SP text TAB text > > > > Note how the length of the strings is the same, hence the current > > code of > > > > compute_ws_delta(...) { > > int d = longer->len - shorter->len; > > out->string = xmemdupz(longer->line, d); > > } > > > > would work fine and not notice the "change in the remainder". > > That ought to be caught by the mismatch variable, that > > is assigned, but not used. > > > > The compare_ws_delta(a, b) needs to be extended to > > > > !a->mismatch && !b->mismatch && existing_condition > > > > Ideally the example from above would be different depending > > on whether the other white space flags are given or not. Thanks - this gives me food for thought. I'm starting to think that it is impossible to avoid creating our own string comparison function that: - seeks to the first non-whitespace character in both strings - checks that both strings, from that first non-whitespace characters, are equal for some definition of equal (whether through strcmp or xdiff_compare_lines) - walks backwards from that first non-whitespace characters to look for the first non-matching whitespace character between the 2 strings The existing diff whitespace modes (to be passed to xdiff_compare_lines) do not give the exact result we want. For example, if XDF_IGNORE_WHITESPACE is used (as is in this patch), lines like "a b" and "ab " would match even though they shouldn't. This comparison function can be used both in moved_entry_cmp() and when constructing the ws_delta (in which case it should be made to output whatever information is needed as out parameters or similar). > >>> + if (diffopt->color_moved_ws_handling & COLOR_MOVED_DELTA_WHITESPACES) > >>> + /* > >>> + * As there is not specific white space config given, > >>> + * we'd need to check for a new block, so ignore all > >>> + * white space. The setup of the white space > >>> + * configuration for the next block is done else where > >>> + */ > >>> + flags |= XDF_IGNORE_WHITESPACE; > >>> + > >>> return !xdiff_compare_lines(a->es->line, a->es->len, > >>> b->es->line, b->es->len, > >>> flags); We can't add XDF_IGNORE_WHITESPACE here - as far as I can tell, this means that more lines will be treated as moved than the user wants (if the user did not set --color-moved-ignore-all-space). > >> It seems like pmb and wsd are parallel arrays - could each wsd be > >> embedded into the corresponding entry of pmb instead? > > > > I'll explore that. It sounds like a good idea for code hygiene. > > Although if you do not intend to use this feature, then keeping it separate > > would allow for a smaller footprint in memory. If you're worried about memory, wsd can be embedded as a pointer. > > The command is missing a --color-moved, as the --color-moved-whitespace-settings > > do not imply --color-moved, yet(?) Maybe one should imply the other (or at least a warning).