Re: [PATCH 2/2] Make it possible to update git_wcwidth()

Junio C Hamano <gitster@xxxxxxxxx> · Mon, 12 May 2014 10:44:05 -0700

Peter Krefting <peter@xxxxxxxxxxxxxxxx> writes:

> Torsten Bögershausen:
>
>> The function git_wcwidth() returns for a given unicode code point the
>> width on the display:
>> -1 for control characters,
>> 0 for combining or other non-visible code points
>> 1 for e.g. ASCII
>> 2 for double-width code points.
>
> This all looks sane, but the problem is that this is also
> context-dependent since there are a lot of characters with "ambiguous"
> widths, i.e., characters that are "double" width for CJK locales (and
> fonts), but "single" width for others. This includes Greek and
> Cyrillic characters, which are encoded using the double-byte parts of
> the CJK DBCS encodings.
>
> I'm not quite sure how much impact this would have on day-to-day Git
> operation in a CJK locale, however, as I guess they would mostly
> encounter texts in their own language (which would mostly be "double"
> width) or English (which would be unambiguously "single" width).
>
> Anyone on the list running Git in a CJK locale that would like to
> weigh in here?

The issue does appear in real life.  A solution I've seen used
in a terminal emulator program was to give the user a choice to say
"I want ambiguous ones to be treated as double (or single)".  As a
J-locale user, I naturally set the configuration to double while
using that program (I no longer do).
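
To make the idea concrete, here is a minimal sketch of such a switch.
It is not the code in the patch; sketch_wcwidth(), ambiguous_is_double
and the three tables are hypothetical names, and the ranges are a small
illustrative subset of what tables generated from the Unicode data
files would contain.  The return values follow the -1/0/1/2 convention
quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Inclusive range of code points. */
struct cp_range { uint32_t first, last; };

/* Illustrative subsets only (assumption): not the full Unicode tables. */
static const struct cp_range zero_width[] = {
	{ 0x0300, 0x036F },	/* combining diacritical marks */
	{ 0x200B, 0x200B },	/* zero width space */
};
static const struct cp_range double_width[] = {
	{ 0x1100, 0x115F },	/* Hangul Jamo initial consonants */
	{ 0x3041, 0x3096 },	/* Hiragana letters */
	{ 0x30A1, 0x30FA },	/* Katakana letters */
	{ 0x4E00, 0x9FFF },	/* CJK unified ideographs */
	{ 0xAC00, 0xD7A3 },	/* Hangul syllables */
	{ 0xFF01, 0xFF60 },	/* fullwidth forms */
};
static const struct cp_range ambiguous_width[] = {
	{ 0x0391, 0x03A1 },	/* Greek capitals */
	{ 0x03A3, 0x03A9 },
	{ 0x03B1, 0x03C1 },	/* Greek small letters */
	{ 0x03C3, 0x03C9 },
	{ 0x0410, 0x044F },	/* Cyrillic */
};

static int in_table(uint32_t ch, const struct cp_range *t, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (t[i].first <= ch && ch <= t[i].last)
			return 1;
	return 0;
}

#define IN_TABLE(ch, t) in_table((ch), (t), sizeof(t) / sizeof((t)[0]))

/*
 * Hypothetical helper: display width of one code point.
 * ambiguous_is_double is the user's choice for "ambiguous" characters.
 */
int sketch_wcwidth(uint32_t ch, int ambiguous_is_double)
{
	if (ch == 0)
		return 0;
	if (ch < 0x20 || (0x7F <= ch && ch < 0xA0))
		return -1;	/* control characters */
	if (IN_TABLE(ch, zero_width))
		return 0;
	if (IN_TABLE(ch, double_width))
		return 2;
	if (IN_TABLE(ch, ambiguous_width))
		return ambiguous_is_double ? 2 : 1;
	return 1;
}
```

With this, sketch_wcwidth(0x03B1, 0) reports a Greek alpha as one
column and sketch_wcwidth(0x03B1, 1) reports it as two, while
unambiguous characters are unaffected by the flag.  Where the flag
would come from (a configuration variable, the locale, or something
else) is left open here.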
