On Sun, 4 Dec 2011, Jakub Narebski wrote:

> Currently gitweb converts data it receives from git commands to Perl
> internal utf8 representation via to_utf8() subroutine
[...]
> Each part of data must be handled separately.  It is quite error prone
> process, as can be seen from quite a number of patches that fix handling
> of UTF-8 data (latest from Jürgen).
>
> Much, much simpler would be to force opening of all files (including
> output pipes from git commands) in ':utf8' mode:
>
>   use open qw(:std :utf8);
>
> [Note: perhaps instead of ':utf8' it should be ':encoding(UTF-8)'
> there...]
>
> But doing this would change gitweb behavior.  [...]
[...]
> I don't know if people are relying on the old behavior.  I guess
> it could be emulated by defining our own 'utf-8-with-fallback'
> encoding, or by defining our own PerlIO layer with PerlIO::via.
> But it no longer be simple solution (though still automatic).

I have now created a simple Encode::UTF8WithFallback module, so that

  use Encode::UTF8WithFallback;
  use open IN => ':encoding(utf8-with-fallback)';

should be able to replace all calls to to_utf8() without any change in
behavior; at least simple tests show that.

There are, however, two problems with this solution:

1. Encode::UTF8WithFallback should really be a separate Perl module in
   a separate file (e.g. 'gitweb/lib/Encode/UTF8WithFallback.pm'); I was
   not able to make it work without a separate file.

   This means that it very much requires the change that allows
   splitting gitweb into many files and/or loading extra helper modules,
   and/or requiring extra non-core modules but providing and installing
   them with gitweb if they are not available.

   These changes are ready, and can be found in the 'gitweb/split'
   branch in my git.git repositories:

     http://repo.or.cz/w/git/jnareb-git.git
     https://github.com/jnareb/git

2. It turned out that the "open" pragma 1.04 from Perl v5.8.6 does not
   work correctly.
   We need at least "open" 1.06 (version 1.05 supposedly consists only
   of a documentation change).  Because "open" is a core Perl module
   (core pragma), this means that in practice gitweb will require at
   least Perl v5.8.9, raising the version requirement from the current
   v5.8.0.

-- 
Jakub Narebski
Poland
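
For illustration, the "with fallback" decoding that the new encoding is
meant to reproduce can be sketched roughly as below.  This is only a
minimal sketch and not the module itself: decode_with_fallback() is a
hypothetical name, and using 'latin1' as the default fallback is an
assumption made for this example.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not the actual Encode::UTF8WithFallback code):
# decode $octets as UTF-8, and if they are not well-formed UTF-8,
# decode them using $fallback instead ('latin1' is assumed here).
sub decode_with_fallback {
	my ($octets, $fallback) = @_;
	return undef unless defined $octets;
	$fallback = 'latin1' unless defined $fallback;

	# utf8::decode() converts in place and returns false if the
	# octets are not well-formed UTF-8, so work on a copy.
	my $copy = $octets;
	return $copy if utf8::decode($copy);

	# FB_DEFAULT substitutes a replacement character for anything the
	# fallback encoding cannot map, instead of dying.
	return Encode::decode($fallback, $octets, Encode::FB_DEFAULT);
}

# The same name as UTF-8 octets and as a single latin1 byte; both
# should decode to the same character string.
my $from_utf8   = decode_with_fallback("J\xc3\xbcrgen");
my $from_latin1 = decode_with_fallback("J\xfcrgen");
print(($from_utf8 eq $from_latin1) ? "same\n" : "different\n");
```

An ':encoding(...)' layer has to apply the same rule to whatever chunks
PerlIO hands it, which the per-string sketch above does not attempt to
handle.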