Hi,

On Sun, 11 Jan 2009, Thomas Rast wrote:

> Johannes Schindelin wrote:
> >
> > But at least _I_ think it is easy to follow, and it actually makes the code
> > more readable/hackable. Correct me if I'm wrong.
>
> It indeed seems a sane approach.

Thanks.

> However, the final result segfaults and/or prints garbage (on
> apparently every commit except very small changes) when using the regex
> '\S+', which IMHO should give exactly the same result as not using a
> regex at all.

No, it should not. The correct regex is '^\S+'. As it happens, your
regex matches _anything_ + non-whitespace. Unfortunately, this includes a
newline which utterly confuses the diff, and therefore the code that tries
to get the true offsets. Consequently, it crashes.

> Plain --color-words is not affected.

Of course, I did not change anything outside the code path of
--color-words.

Ciao,
Dscho

-- snipsnap --
[PATCH] color-words: \n must not be a part of the word.

Allowing \n as part of a word is a pilot error, but that is not a reason
for the code to crash.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@xxxxxx>
---
 diff.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/diff.c b/diff.c
index d6bba72..676eb79 100644
--- a/diff.c
+++ b/diff.c
@@ -381,8 +381,10 @@ static int find_word_boundary(mmfile_t *buffer, int i, regex_t *word_regex)
 	if (word_regex) {
 		regmatch_t match[1];
-		if (!regexec(word_regex, buffer->ptr + i, 1, match, 0))
-			i += match[0].rm_eo;
+		if (!regexec(word_regex, buffer->ptr + i, 1, match, 0)) {
+			char *p = memchr(buffer->ptr + i, '\n', match[0].rm_eo);
+			i = p ? p - buffer->ptr : match[0].rm_eo + i;
+		}
 	}
 	else
 		while (i < buffer->size && !isspace(buffer->ptr[i]))
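
To illustrate the failure mode described above, here is a minimal standalone sketch. It is not part of the patch, and it does not use git's mmfile_t. The helper names word_end_unclamped() and word_end_clamped() are hypothetical, and the POSIX class expression '[^[:space:]]+' is assumed as a portable stand-in for '\S+' (which is an extension, not POSIX). The first helper mirrors the old "i += match[0].rm_eo" logic, the second mirrors the memchr() clamp from the patch.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/*
 * Mirrors the pre-patch logic: advance by rm_eo, which is measured from
 * buf + i and includes anything an unanchored regex skipped before the
 * match actually started (possibly a newline).
 */
static size_t word_end_unclamped(const char *buf, size_t i, const regex_t *re)
{
	regmatch_t match[1];

	assert(buf && re);
	if (regexec(re, buf + i, 1, match, 0))
		return i;	/* no match: offset stays where it was */
	return i + (size_t)match[0].rm_eo;
}

/*
 * Mirrors the patched logic: if a newline occurs anywhere in the first
 * rm_eo bytes, stop at that newline instead of stepping over it.
 */
static size_t word_end_clamped(const char *buf, size_t i, const regex_t *re)
{
	regmatch_t match[1];
	const char *p;

	assert(buf && re);
	if (regexec(re, buf + i, 1, match, 0))
		return i;	/* no match: offset stays where it was */
	p = memchr(buf + i, '\n', (size_t)match[0].rm_eo);
	return p ? (size_t)(p - buf) : i + (size_t)match[0].rm_eo;
}
```

With the regex compiled via regcomp(&re, "[^[:space:]]+", REG_EXTENDED) and the buffer "a\nfoo bar", calling either helper at offset 1 (the newline) makes regexec() skip the newline and match "foo". The unclamped helper then returns 5, so the newline ends up inside the "word"; the clamped helper returns 1 and stays on the newline. Anchoring the regex with '^' avoids the skip in the first place.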