Re: [PATCH take 3 0/4] color-words improvements

On Thursday 15 January 2009, Santi Béjar <santi@xxxxxxxxxxx> wrote 
about 'Re: [PATCH take 3 0/4] color-words improvements':
>It may be ok and logical, but for me it is not what I want. Maybe I
>don't really understand what I want or it is a crazy idea but here it is
>anyway:

The discussion above is mildly theoretical.  I don't imagine someone is 
going to intentionally mark 98% of a file as non-words, which is basically 
what you are doing with a regex of "a+".

>a) primary words are those with alphanumerics (or a regex)

regex: [[:alnum:]]+

example words: matrix ball I a
example non-words: don't haven't

>b) secondary "words" are the other non-whitespace characters (in this
>case "[]{}" and ",")

regex: []{}[,]

example words: [ , }
example non-words: [] ball 147
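
As a rough check of the two classes above, here is a minimal Python sketch.
It is not how git evaluates the regex: Python's re module has no POSIX
bracket classes, so [[:alnum:]] is approximated by [A-Za-z0-9] (an assumption
that only holds for ASCII text), and the names PRIMARY, SECONDARY and is_word
are made up for illustration.

```python
import re

# Assumption: [A-Za-z0-9] stands in for POSIX [[:alnum:]] (ASCII only).
PRIMARY = re.compile(r"[A-Za-z0-9]+")
# Same character set as []{}[,] in POSIX syntax, escaped for Python.
SECONDARY = re.compile(r"[\]\[{},]")

def is_word(pattern, text):
    """True when the whole string is exactly one match of the pattern."""
    return pattern.fullmatch(text) is not None

primary_candidates = ["matrix", "ball", "I", "a", "don't", "haven't"]
secondary_candidates = ["[", ",", "}", "[]", "ball", "147"]

primary_words = [w for w in primary_candidates if is_word(PRIMARY, w)]
secondary_words = [w for w in secondary_candidates if is_word(SECONDARY, w)]
```

Under those assumptions the two lists come out the same as the "example
words" lines above, and the rest are the non-words.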

>c) whitespaces are cruft.
>
>(having two regexps to specify what is a word, but they cannot mix).

Combine regex with '|' to get:
[[:alnum:]]+|[]{}[,]
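
To illustrate how the combined regex would split a line into words, here is
a small Python sketch. It only imitates the tokenizing step, not git's
implementation; [[:alnum:]] is again approximated by [A-Za-z0-9] (ASCII-only
assumption), and split_words is a hypothetical helper name.

```python
import re

# Assumption: Python translation of [[:alnum:]]+|[]{}[,] for ASCII input.
WORD_RE = re.compile(r"[A-Za-z0-9]+|[\]\[{},]")

def split_words(line):
    """Return the matches in order; text matching neither branch is skipped."""
    return WORD_RE.findall(line)

old_words = split_words("matrix[a,b,c]")
new_words = split_words("matrix{d,b,c}")

# Positions at which the two word lists disagree.
changed = [i for i, (o, n) in enumerate(zip(old_words, new_words)) if o != n]
```

Each line splits into eight words here, and only the bracket and the first
letter inside it differ at the start, plus the closing bracket at the end.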

>If everything works as I think (it's late night :-) with the above two
> lines:
>
>matrix[a,b,c]
>matrix{d,b,c}
>
>the word diff would be
>
>matrix<RED>[<GREEN>{<RED>a<GREEN>d<RESET>,b,c<RED>]<GREEN>}<RED>

For this specific case, the regex "[^[:space:]]" by itself should work, 
although it would end up being a character-by-character diff.

The regex you built from your description "[[:alnum:]]+|[]}{[,]" would also 
give the same diff.  However:
-dont
+don't
gives a word diff of:
don't
not:
don<RED>'<RESET>t
because "'" is not recognized as part of any word, it is considered 
ignorable.
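
The point about the apostrophe can be checked with a short Python sketch of
the tokenizing step alone (not of git itself). [[:alnum:]] is approximated by
[A-Za-z0-9], which is an ASCII-only assumption, and split_words is a
hypothetical helper.

```python
import re

# Assumption: Python translation of [[:alnum:]]+|[]}{[,] for ASCII input.
WORD_RE = re.compile(r"[A-Za-z0-9]+|[\]\[{},]")

def split_words(text):
    """Return only the matched words; unmatched characters are dropped."""
    return WORD_RE.findall(text)

with_apostrophe = split_words("don't")
with_space = split_words("don t")

# The apostrophe never shows up in any token, so it is treated like whitespace.
apostrophe_seen = any("'" in w for w in with_apostrophe)
```

Since "don't" and "don t" produce the same word list, a comparison based
only on these words cannot tell them apart.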

There was a patch that included documentation that most users should add 
"|[^[:space:]]" to the end of their regex, to capture all non-whitespace 
characters that are not otherwise part of a word as individual, 
single-character "words".
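
As a sketch of what that suffix changes, here is the same tokenizing step in
Python with and without the catch-all branch. The translation is an
assumption: [A-Za-z0-9] stands in for [[:alnum:]] and \S for [^[:space:]],
both reasonable only for ASCII input, and the variable names are made up.

```python
import re

# Without the catch-all: apostrophes and other punctuation are ignored.
BASE_RE = re.compile(r"[A-Za-z0-9]+|[\]\[{},]")
# With "|[^[:space:]]" appended (written as |\S in Python).
CATCH_ALL_RE = re.compile(r"[A-Za-z0-9]+|[\]\[{},]|\S")

base_tokens = BASE_RE.findall("don't stop")
catch_all_tokens = CATCH_ALL_RE.findall("don't stop")
```

With the extra branch the apostrophe becomes its own one-character word,
while whitespace is still skipped.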
-- 
Boyd Stephen Smith Jr.                     ,= ,-_-. =. 
bss@xxxxxxxxxxxxxxxxx                     ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy           `-'(. .)`-' 
http://iguanasuicide.net/                      \_/     
