Jeff King <peff@xxxxxxxx> writes:

> But more big-picture, comparing the output of the old color words and
> this implementation, there is one thing I don't like: the new one
> doesn't bring together runs of additions and deletions, which can make
> parsing text much easier. For example:
>
>   $ echo This is a complete sentence. >one
>   $ echo Here is some totally different text. >two
>
>   # with old implementation; /-.../ is red, /+.../ is green
>   $ git diff --color-words one two
>   ...
>   /-This/ /+Here/ is /-a complete sentence./+some totally different text./
>
>   # with this patch
>   $ git diff --color-words one two
>   ...
>   /-This/+Here/ is /-a/+some/ /-complete/+totally/ /-sentence./+different text./

I suspect that heavily depends on the input text. If you drop
"different" in the example, the output becomes:

  {-This|+Here} is {-a|+some} {-complete|+totally} {-sentence.|+text.}

which is totally sensible.

You can get output that is closer to the original by tweaking the
definition of what a token is. You can, for example, define a token as
"0 or more non-whitespace characters followed by 1 or more whitespace
characters", and then the internal diff would become ($ to show the end
of line):

  -This $
  +Here $
   is $
  -a $
  -complete $
  -sentence.$
  +some $
  +totally $
  +different $
  +text.$

which would yield on the output:

  {-This |+Here }is {-a complete sentence.|+some totally different text.}

It's all in diff_words_tokenize(), which I kept deliberately stupid so
that people can tweak it to their liking.
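
To make the alternative token definition concrete, here is a rough sketch in Python rather than in the C of the patch. It is not diff_words_tokenize() itself: the names tokenize() and word_diff() are made up for illustration, Python's difflib stands in for the internal line diff, and the inputs carry no trailing newline. It only shows how keeping trailing whitespace inside each token lets adjacent changed words coalesce into one run.

```python
import difflib
import re


def tokenize(text):
    # Hypothetical tokenizer: a token is 0 or more non-whitespace
    # characters followed by 1 or more whitespace characters. The second
    # alternative picks up a final word that has no trailing whitespace.
    return re.findall(r"\S*\s+|\S+", text)


def word_diff(old, new):
    # Diff the two token lists (difflib is an assumption standing in for
    # the real diff machinery) and render runs as {-removed|+added}.
    a, b = tokenize(old), tokenize(new)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        removed = "".join(a[i1:i2])
        added = "".join(b[j1:j2])
        if tag == "equal":
            out.append(removed)
        elif tag == "replace":
            out.append("{-" + removed + "|+" + added + "}")
        elif tag == "delete":
            out.append("{-" + removed + "}")
        else:  # tag == "insert"
            out.append("{+" + added + "}")
    return "".join(out)


if __name__ == "__main__":
    print(word_diff("This is a complete sentence.",
                    "Here is some totally different text."))
```

Because each token owns the whitespace that follows it, no unchanged whitespace-only token sits between consecutive changed words to split them apart, so "a complete sentence." and "some totally different text." each come out as a single run.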