On Mon, Sep 26, 2016 at 06:22:10PM -0700, Junio C Hamano wrote:

> @@ -501,17 +516,9 @@ char *reencode_string_len(const char *in, int insz,
>  
>  	conv = iconv_open(out_encoding, in_encoding);
>  	if (conv == (iconv_t) -1) {
> -		/*
> -		 * Some platforms do not have the variously spelled variants of
> -		 * UTF-8, so let's fall back to trying the most official
> -		 * spelling. We do so only as a fallback in case the platform
> -		 * does understand the user's spelling, but not our official
> -		 * one.
> -		 */
> -		if (is_encoding_utf8(in_encoding))
> -			in_encoding = "UTF-8";
> -		if (is_encoding_utf8(out_encoding))
> -			out_encoding = "UTF-8";
> +		in_encoding = fallback_encoding(in_encoding);
> +		out_encoding = fallback_encoding(out_encoding);
> +

This comment is interesting. We're concerned about a platform knowing
"utf8" but not "UTF-8". When we fall back, we do it for both the input
and output encodings, because we don't know which may have caused the
problem.

So is it possible that we improve one case but break the other?

With just UTF-8, I don't think so. That could only be the case with
something like "utf8 -> utf-8", because they both become "UTF-8". So
either it improves the situation or not (because we either understand
UTF-8 or not).

But once we introduce other fallbacks, then "utf8 -> latin1" may become
"UTF-8 -> iso8859-1". A system that knows only "utf8" and "iso8859-1"
_could_ work if we turned the knobs individually, but won't if we turn
them both at once. Worse, a system that knows only "UTF-8" and "latin1"
works now, but would break with your patches.

I'm not convinced it's worth worrying about, though. The existence of
such a system is theoretical at this point. I'm not even sure how common
the "know about utf8 but not UTF-8" thing is, or whether we were merely
being overly cautious.

-Peff