Johannes Schindelin wrote:
> On Fri, 9 Jan 2009, Thomas Rast wrote:
>
> > Allows for user-configurable word splits when using --color-words.  This
> > can make the diff more readable if the regex is configured according to
> > the language of the file.
> >
> > For now the (POSIX extended) regex must be set via the environment
> > GIT_DIFF_WORDS_REGEX.  Each (non-overlapping) match of the regex is
> > considered a word.  Anything characters not matched are considered
> > whitespace.  For example, for C try
> >
> >   GIT_DIFF_WORDS_REGEX='[0-9]+|[a-zA-Z_][a-zA-Z0-9_]*|(\+|-|&|\|){1,2}|\S'
> [...]
> Interesting idea.  However, I think it would be better to do the opposite,
> have _word_ patterns.  And even better to have _one_ pattern.

I'm not sure I understand.  It _is_ a single pattern.  The examples
just have several alternatives to distinguish the various semantic
groups that can occur, as a sort of "half tokenizer".  (The C example
isn't very complete, however.)

> BTW I think you could do what you intended to do with a _way_ smaller
> and more intuitive patch.

How?

I don't think the existing mechanism, which just replaces all
whitespace with newlines and does a line diff to find out which words
changed, can "just" be adapted.  We would have to insert extra newlines
at points where the regex says to split a word, but where there was no
whitespace in the original content.

If there's a significantly easier way to do that than what I hacked up,
please share.  Or maybe I got your original code all wrong?

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
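To make the splitting rule above concrete, here is a rough sketch in Python, not the patch itself. It treats each non-overlapping match of the regex as one word, drops everything unmatched as whitespace, and then diffs the two word lists as if each word were a line. The names `split_words` and `word_diff` are made up for this illustration. Python's `re` module is also only a stand-in: it picks the first matching alternative, while POSIX extended regexes pick the longest match, so results can differ for other patterns.

```python
import difflib
import re

# The C example regex from the message. Using Python's `re` here is an
# assumption for illustration; the patch itself uses POSIX extended regexes.
WORD_REGEX = re.compile(r"[0-9]+|[a-zA-Z_][a-zA-Z0-9_]*|(\+|-|&|\|){1,2}|\S")


def split_words(text, regex=WORD_REGEX):
    """Return every non-overlapping match as one word; unmatched text is dropped."""
    return [m.group(0) for m in regex.finditer(text)]


def word_diff(old, new, regex=WORD_REGEX):
    """Diff two strings word by word, treating each word like a line.

    Returns (marker, word) pairs, where marker is ' ', '-' or '+'.
    """
    old_words = split_words(old, regex)
    new_words = split_words(new, regex)
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            result.extend((" ", w) for w in old_words[i1:i2])
        else:
            # 'replace', 'delete' and 'insert' all reduce to removed + added words.
            result.extend(("-", w) for w in old_words[i1:i2])
            result.extend(("+", w) for w in new_words[j1:j2])
    return result


if __name__ == "__main__":
    for marker, word in word_diff("foo(a+b)", "foo(a-b)"):
        print(marker, word)
```

Note that `foo(a+b)` contains no whitespace at all, so a split on whitespace alone would see it as a single changed word; with the regex, only the operator shows up as changed.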