Re: [RFD] Handling of non-UTF8 data in gitweb

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 4 Dec 2011, Jakub Narebski wrote:
> 
> Currently gitweb converts data it receives from git commands to Perl 
> internal utf8 representation via to_utf8() subroutine
[...]
> Each part of data must be handled separately.  It is quite error prone
> process, as can be seen from quite a number of patches that fix handling
> of UTF-8 data (latest from Jürgen).
> 
> 
> Much, much simpler would be to force opening of all files (including 
> output pipes from git commands) in ':utf8' mode:
> 
>   use open qw(:std :utf8);
> 
> [Note: perhaps instead of ':utf8' it should be ':encoding(UTF-8)' 
>  there...]
> 
> But doing this would change gitweb behavior.  [...]
[...]
> I don't know if people are relying on the old behavior.  I guess
> it could be emulated by defining our own 'utf-8-with-fallback'
> encoding, or by defining our own PerlIO layer with PerlIO::via.
> But it no longer be simple solution (though still automatic).

I have now created simple Encode::UTF8WithFallback module, so that

  use Encode::UTF8WithFallback;
  use open IN => ':encoding(utf8-with-fallback)';

should be able to replace all calls to to_utf8() without any change
in behavior; at least simple tests shows that.


There however are two problems with this solution:

1. Encode::UTF8WithFallback should really be a separate Perl module
   in a separate file (e.g. 'gitweb/lib/Encode/UTF8WithFallback.pm');
   I was not able to make it work without a separate file.

   This means that it very much requires the change that allows splitting
   gitweb into many files and/or load extra helper modules, and/or require
   extra non-core modules but provide and install them with gitweb if they
   are not available.  These changes are ready, and can be find in 

     'gitweb/split'
   
   branch in my git.git repositories:

     http://repo.or.cz/w/git/jnareb-git.git
     https://github.com/jnareb/git


2. It turned out that the "open" pragma 1.04 from Perl v5.8.6 does not
   work correctly.  We need at least "open" 1.06 (version 1.05 consists
   supposedly only of documentation-only change).

   Because "open" is a core Perl module (core pragma), this means that
   gitweb will require in practice Perl v5.8.9 at least, increasing
   version requirement from current v5.8.0
 
-- 
Jakub Narebski
Poland
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]