Jakub Narebski <jnareb@xxxxxxxxx> writes:

> On Wed, 7 Dec 2011, Junio C Hamano wrote:
>> I think we added and you acked 00f429a (gitweb: Handle non UTF-8 text
>> better, 2007-06-03) for a good reason, and I think the above argues that
>> whatever issue the commit tried to address is a non-issue. Is it really
>> true?
>
> I think that UTF-8 is much more prevalent character encoding in operating
> systems, programming languages, APIs, and software applications than it
> was 4 years ago.

Yeah, that was the kind of "reasoning behind it" explanation I was hoping
to see spelled out for people to agree or disagree with.

But then won't the updated gitweb have trouble showing the history of
projects that have 4 years or more of history (hopefully Git itself is
not among them)?

> The proposed
>
>     use open qw(:std :utf8);
>
> and removal of to_utf8 and $fallback_encoding would be regression compared
> to post-00f429a... but the tradeoff of more robust UTF-8 handling might be
> worth it.
>
>> > ... I guess
>> > it could be emulated by defining our own 'utf-8-with-fallback'
>> > encoding, or by defining our own PerlIO layer with PerlIO::via.
>> > But it no longer be simple solution (though still automatic).
>>
>> Between the current "everybody who read from the input must remember to
>> call to_utf8" and "everybody gets to read utf8 automatically for internal
>> processing", even though the latter may involve one-time pain to set up
>> the framework to do so, the pros-and-cons feels obvious to me.
>
> There is also a matter of performance (':utf8' and ':encoding(UTF-8)'
> are AFAIK implemented in C, both the Encode part and PerlIO part).

Would a reasonable first step be to replace the calls to bare "open" with
a wrapper that simulates the "open" interface (e.g. "sub git_open"), but
still keeps the same behaviour as post-00f429a, even though that could be
much slower than the native one? Then a separate patch can build a
"regression but uses native and much faster" alternative on top, no?
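
To make that first step a bit more concrete, here is a rough sketch of
what such a wrapper might look like. It is only an illustration, not
gitweb's actual code: the names to_utf8_with_fallback and FallbackHandle
are made up, git_open is just the name floated above, and the 'latin1'
value for $fallback_encoding is assumed for the example. The idea is
that callers read from the returned handle as usual and every line is
decoded as UTF-8 when it is well-formed, or via the fallback encoding
otherwise.

```perl
use strict;
use warnings;
use Encode ();
use Symbol qw(gensym);

# Assumed value for illustration; in gitweb this is a configurable variable.
our $fallback_encoding = 'latin1';

# Decode octets as UTF-8 if they are well-formed, else via the fallback.
sub to_utf8_with_fallback {
    my ($octets) = @_;
    return undef unless defined $octets;
    my $text = $octets;
    # utf8::decode() converts in place and returns false on malformed input.
    return $text if utf8::decode($text);
    return Encode::decode($fallback_encoding, $octets, Encode::FB_DEFAULT);
}

{
    # Hypothetical tied-handle class: decodes every line read from the
    # underlying raw handle.
    package FallbackHandle;

    sub TIEHANDLE {
        my ($class, $raw) = @_;
        return bless { raw => $raw }, $class;
    }

    sub READLINE {
        my ($self) = @_;
        my $raw = $self->{raw};
        return map { main::to_utf8_with_fallback($_) } <$raw> if wantarray;
        return main::to_utf8_with_fallback(scalar <$raw>);
    }

    sub EOF   { my $raw = $_[0]{raw}; return eof($raw) }
    sub CLOSE { my $raw = $_[0]{raw}; return close($raw) }
}

# Wrapper taking the mode and remaining arguments of three-or-more-argument
# open(); returns a handle whose reads are decoded, or nothing on failure.
sub git_open {
    my ($mode, @args) = @_;
    open(my $raw, $mode, @args) or return;
    my $fh = gensym();
    tie *$fh, 'FallbackHandle', $raw;
    return $fh;
}

# Usage sketch: a real caller would pass '-|' and a git command line;
# an in-memory file stands in for the pipe here.
my $octets = "caf\xC3\xA9\ncaf\xE9\n";    # one UTF-8 line, one Latin-1 line
my $in = git_open('<', \$octets) or die "git_open failed: $!";
binmode STDOUT, ':encoding(UTF-8)';
while (my $line = <$in>) {
    print $line;
}
close $in;
```

Going through a tied handle means every read is a Perl-level method
call, so this is expected to be slower than a native PerlIO layer; that
is the cost of keeping the fallback behaviour in the first step, and
what a follow-up patch could trade away.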