Re: [PATCH] git-blame: Make the output human readable

Sergey Vlasov <vsu@xxxxxxxxxxx> · Wed, 8 Mar 2006 17:32:49 +0300

On 6 Mar 2006 14:33:26 -0500 linux@xxxxxxxxxxx wrote:

> Well, getting 15 characters in UTF-8 is easy (just stop before the 16th
> byte for which ((b & 0xc0) != 0x80)), but what about combining characters?
> 
> You've got accents and stuff to worry about.  And the annoying fact that
> Unicode defined accents as suffixes, so you have to go past the 15th
> column to include all of the 
> 
> And then there's the fact that many characters are traditionally
> represented as double-wide forms, even on character terminals.
> 
> See http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c for details
> and an example implementation of wcwidth().
> 
[skip]
> 			/* Now find the width of it */
> 			w = wcwidth(c);

And this won't work, unless you also add that wcwidth() implementation
to git.

The problem is that the encoding of wchar_t is not specified anywhere -
glibc uses Unicode for it, but other systems can use whatever they want
(even a locale-dependent encoding).
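
One way around this, as a rough sketch only: decode the UTF-8 bytes by hand
into a plain Unicode code point and never go through wchar_t at all, so that
a bundled width function could take that code point on any system.  The
names utf8_decode() and WCHAR_IS_UNICODE below are made up for illustration
and are not anything that exists in git.  __STDC_ISO_10646__ is the standard
C99 macro an implementation defines when its wchar_t values are ISO 10646
code points.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * C99 implementations define __STDC_ISO_10646__ when wchar_t values are
 * ISO 10646 (Unicode) code points.  WCHAR_IS_UNICODE is a hypothetical
 * name used only in this sketch.
 */
#ifdef __STDC_ISO_10646__
#define WCHAR_IS_UNICODE 1
#else
#define WCHAR_IS_UNICODE 0
#endif

/*
 * Hypothetical helper: decode one UTF-8 sequence from s (at most len
 * bytes) into *cp.  Returns the number of bytes consumed, or 0 if the
 * sequence is malformed or truncated.  It does not depend on the locale
 * or on how the platform encodes wchar_t.
 */
static size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
	size_t n, i;
	uint32_t c, min;

	if (!len)
		return 0;
	if (s[0] < 0x80) {
		*cp = s[0];
		return 1;
	} else if ((s[0] & 0xe0) == 0xc0) {
		n = 2; c = s[0] & 0x1f; min = 0x80;
	} else if ((s[0] & 0xf0) == 0xe0) {
		n = 3; c = s[0] & 0x0f; min = 0x800;
	} else if ((s[0] & 0xf8) == 0xf0) {
		n = 4; c = s[0] & 0x07; min = 0x10000;
	} else {
		return 0;	/* stray continuation byte or invalid lead byte */
	}
	if (len < n)
		return 0;
	for (i = 1; i < n; i++) {
		/* every byte after the first must look like 10xxxxxx */
		if ((s[i] & 0xc0) != 0x80)
			return 0;
		c = (c << 6) | (s[i] & 0x3f);
	}
	/* reject overlong forms, surrogates and values past U+10FFFF */
	if (c < min || c > 0x10ffff || (c >= 0xd800 && c <= 0xdfff))
		return 0;
	*cp = c;
	return n;
}
```

The code point that comes out of this could then be handed to a width
function that takes Unicode values (such as a copy of the wcwidth.c
implementation mentioned above), instead of relying on the system
wcwidth() and whatever wchar_t happens to mean there.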