Re: [PATCH] make --color-words separate word on ispunct

On Sat, 12-04-2008 at 23:32 +0800, Ping Yin wrote:
> On Sat, Apr 12, 2008 at 11:23 PM, Johannes Schindelin
> <Johannes.Schindelin@xxxxxx> wrote:
> > Hi,
> >
> >
> >  On Sat, 12 Apr 2008, sgala@xxxxxxxxxxxx wrote:
> >
> >  > Note that this may actually be harmful when trying to spot punctuation
> >  > changes, but for this use case I don't think color-words is helping now
> >  > either.
> >
> >  I do not know how commonly supported ispunct() is; therefore I do not
> >  like the patch too much.
> >

I didn't like the patch that much either, but at least it was a quick
proof of concept. :)

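Regarding support for ispunct(): according to the Linux man page, it
checks for:

any printable character which is not a space or an alphanumeric
character.

so ispunct(c) could be written as isprint(c) && !isspace(c) && !isalnum(c),
and isspace(c) || ispunct(c) is roughly isprint(c) && !isalnum(c) (the
latter misses tab, newline and the other non-printable whitespace).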
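
To make the equivalence concrete, here is a minimal sketch of the
boundary test built only from isprint/isspace/isalnum. The names
my_ispunct and is_word_boundary are made up for illustration and are not
taken from the patch or from git's sources; ispunct() itself is part of
standard C <ctype.h>, so the fallback would only matter on a platform
that somehow lacked it.

```c
#include <ctype.h>

/* Hypothetical fallback: printable, not whitespace, not alphanumeric. */
int my_ispunct(int c)
{
	return isprint(c) && !isspace(c) && !isalnum(c);
}

/* Hypothetical helper: does c separate two words? */
int is_word_boundary(int c)
{
	return isspace(c) || my_ispunct(c);
}
```

As with the other <ctype.h> functions, callers should pass an unsigned
char value (or EOF), not a plain, possibly negative, char.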

> >  Besides, since long ago I want to make the list of boundary characters
> >  configurable, preferably as a tr(1) style list, but I have not come around
> >  to do that yet.
> >

That would be cool; it was my first thought until I saw this "easy try".
But I'm not a C programmer. I was just trying to check the correctness of
a few name additions in lines of comma-separated ids, 100 names or so
per line. The patch I sent is not perfect, but it achieved 80% of what I
wanted with 10 minutes of effort (including build, test and sending the
patch).
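
For what it is worth, a configurable boundary list along the lines
suggested above might look roughly like the sketch below. It is only an
illustration: build_boundary_table is a hypothetical name, and the spec
syntax is a small subset of what tr(1) accepts (plain characters and
"a-b" ranges, no escapes or character classes).

```c
#include <string.h>

/*
 * Hypothetical helper: mark every character named in spec as a word
 * boundary. spec may contain plain characters and "lo-hi" ranges.
 */
void build_boundary_table(const char *spec, unsigned char table[256])
{
	const unsigned char *p = (const unsigned char *)spec;
	unsigned int c, lo, hi;

	memset(table, 0, 256);
	while (*p) {
		lo = hi = p[0];
		if (p[1] == '-' && p[2]) {	/* "lo-hi" range */
			hi = p[2];
			p += 3;
		} else {			/* single character */
			p++;
		}
		for (c = lo; c <= hi; c++)
			table[c] = 1;
	}
}
```

The word splitter would then test table[(unsigned char)c] instead of
calling isspace() or ispunct(). A trailing "-" is treated as a literal
dash, and a reversed range such as "z-a" marks nothing.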

On the other hand, while --color-words is very useful for text or for
detecting typos, with big text changes it sometimes gives worse results
than --color. See, for instance, in the git repo, the second hunk of

git diff --stat -p --color-words f59774add488a6c5fb440a4aaa7255f594b1027d^ -- builtin-fetch.c

(and the same with just --color). I'm not sure how to fix it; ideally
there would be some automated way to switch between line-oriented and
word-oriented coloring depending on the density of changes.

> 
> It is such a good idea. I look forward to it. Further, should
> --color-words support multibyte characters, where every character
> is a boundary?
> 

This would require more changes: switching to the
iswspace/iswpunct/iswprint/iswalnum functions, with the associated change
from chars to wide chars.
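
A rough sketch of what the wide-character variant of the boundary test
could look like, assuming the text has already been converted to wchar_t
(for example with mbrtowc() after a suitable setlocale() call). The
names is_word_boundary_w and count_words_w are hypothetical. Note that
this only mirrors the isspace/ispunct logic; it does not make every
multibyte character a boundary of its own.

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical wide-character counterpart of the boundary test. */
int is_word_boundary_w(wint_t wc)
{
	return iswspace(wc) || iswpunct(wc);
}

/* Count the words in a wide string, splitting on the test above. */
size_t count_words_w(const wchar_t *s)
{
	size_t n = 0;
	int in_word = 0;

	for (; *s; s++) {
		if (is_word_boundary_w((wint_t)*s)) {
			in_word = 0;
		} else if (!in_word) {
			in_word = 1;	/* first character of a new word */
			n++;
		}
	}
	return n;
}
```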

Regards
Santiago

