On Mon, 30 Aug 2010, Ævar Arnfjörð Bjarmason wrote:
> On Sun, Aug 29, 2010 at 20:45, Jonathan Nieder <jrnieder@xxxxxxxxx> wrote:
> A would be preferred for correctness, and with a fallback BSD printf()
> we can avoid the GNU libc bug, however that'll mean using LC_CTYPE,
> which'll have some small side-effects for the rest of the code.
The real problem is that you are probably using the same
(locale-enabled) functions for the user-facing side as well as for
the backend (talking to the repository). Some projects decided to use
a special encoding internally (like UCS-2 in the case of Java
and Python 2.x, Unicode ordinals in Python 3.x). Otherwise
you may end up with incompatibilities in the on-disk or
on-network format. I don't think you want to keep telling all bug
reporters for a few years - "Can you try that again with env LANG=C,
please?" :)
Bringing Unicode onboard means that a simple strlen() no longer
does what you normally think it does.
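
To illustrate the strlen() point, here is a minimal sketch, assuming the
strings are valid UTF-8. utf8_codepoints() is a hypothetical helper made
up for this example, not anything in git. strlen() returns the number of
bytes, while the helper counts code points by skipping continuation
bytes (those of the form 10xxxxxx):

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count UTF-8 code points by counting every byte
 * that is not a continuation byte (10xxxxxx). Assumes valid UTF-8 input.
 */
static size_t utf8_codepoints(const char *s)
{
	size_t n = 0;

	for (; *s; s++)
		if (((unsigned char)*s & 0xC0) != 0x80)
			n++;
	return n;
}
```

With the name "Ævar" encoded as UTF-8 (the bytes C3 86 followed by
"var"), strlen() reports 5 while utf8_codepoints() reports 4. Neither
number is the display width, which is yet another quantity.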
On Mon, 30 Aug 2010, Jonathan Nieder wrote:
> Ævar Arnfjörð Bjarmason wrote:
>> We can even keep the "Content-Type: text/plain; charset=UTF-8\n" and
>> *not* use LC_CTYPE if we add a bind_textdomain_codeset("git", "UTF-8")
>> call to gettext.
> Oh! I'd personally prefer to do that for now. :) (Not because of the
> known printf problem but because I like to reduce possible unknowns.)
Well, in this case everybody will be forced to have UTF-8 output
on screen, which is not useful for people using ISO8859-*, KOI8-R and
similar things...
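
For reference, the setup being discussed would look roughly like the
sketch below. This is an assumption about how the calls fit together,
not the actual patch: init_gettext_utf8() and its localedir parameter
are names invented for the example. Only LC_MESSAGES is taken from the
environment, LC_CTYPE is left untouched, and translated messages for the
"git" domain are always returned as UTF-8, whatever the terminal uses:

```c
#include <libintl.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical init routine: returns 0 on success, -1 on failure.
 * localedir is the directory holding the message catalogs.
 */
static int init_gettext_utf8(const char *localedir)
{
	/* Pick the message language from the environment; LC_CTYPE is not set here. */
	setlocale(LC_MESSAGES, "");

	if (!bindtextdomain("git", localedir))
		return -1;
	/* Always hand back UTF-8, regardless of the user's locale charset. */
	if (!bind_textdomain_codeset("git", "UTF-8"))
		return -1;
	return textdomain("git") ? 0 : -1;
}
```

The second call is exactly what causes the problem above: a user on a
KOI8-R terminal gets UTF-8 bytes written to it.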
--Marcin