Re: [RFC] Print diffs of UTF-16 to console / patches to email as UTF-8...?

Jonathan Nieder <jrnieder@xxxxxxxxx> · Fri, 22 Oct 2010 12:30:55 -0500

Drew Northup wrote:

> Please forgive me for being offended that UTF-16 text is not "generic"
> enough.

First some words of explanation.

By "generic" I did not mean ubiquitous, unbranded, popular, or some
other almost-synonym.  What I actually meant is that it is not obvious
what to do with UTF-16.  Should it be converted to UTF-8 for output?
Should it always be normalized when added to the index, so that
switching between canonically equivalent sequences does not result
in spurious diffs?  Should the byte-for-byte representation be
faithfully preserved, even when it is not valid UTF-16?

In such a situation, a good approach is often the following:
take care of mechanism first, then policy.  So the first thing to do
is to make sure that the code is _capable_ of what people are trying
to do; then one can try various configurations and see what is most
convenient; and finally, one can make sure the program behaves in an
intuitive way by setting a reasonable default.

So by "generic" I meant those mechanisms that can be used in the
context of multiple policies.
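
To make that concrete, here is a rough Python sketch.  None of this is
git code: the function names, the policy table, and the choice of
"preserve" as the default are all made up for illustration, and the
input is assumed to be UTF-16LE without a BOM.  The three small
functions are the mechanisms; which one runs is the policy.

```python
import unicodedata

# Mechanisms (hypothetical helpers): each is usable under several policies.
def decode_utf16le(raw):
    """Return the decoded text, or None if raw is not valid UTF-16LE."""
    try:
        return raw.decode("utf-16-le")
    except UnicodeDecodeError:
        return None

def as_utf8(raw):
    """Convert to UTF-8 for output; pass invalid input through untouched."""
    text = decode_utf16le(raw)
    return raw if text is None else text.encode("utf-8")

def as_nfc(raw):
    """Normalize to NFC so canonically equivalent sequences compare equal."""
    text = decode_utf16le(raw)
    if text is None:
        return raw
    return unicodedata.normalize("NFC", text).encode("utf-16-le")

def as_is(raw):
    """Preserve the byte-for-byte representation."""
    return raw

# Policy: a configurable choice among the mechanisms, plus a default.
# The default here is an assumption for the sketch, not a recommendation.
POLICIES = {"utf8": as_utf8, "nfc": as_nfc, "preserve": as_is}
DEFAULT_POLICY = "preserve"

def apply_policy(raw, policy=DEFAULT_POLICY):
    return POLICIES[policy](raw)
```

For instance, "\u00e9" and "e\u0301" encode to different bytes in
UTF-16LE, so apply_policy(..., "preserve") keeps them distinct, while
apply_policy(..., "nfc") maps both to the same bytes.  A lone
surrogate such as b"\x00\xd8" is not valid UTF-16, and in this sketch
every policy passes it through unchanged.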

Apologies; I never meant to offend.  Please carry on and I will leave
you in peace.

Jonathan