Shin Kojima <shin@xxxxxxxxxx> writes:

> Some multi-byte character encodings (such as Shift_JIS and GBK) have
> characters whose final byte is an ASCII '\' (0x5c), and they
> will be displayed as funny-characters even if $fallback_encoding is
> correct.  This is because the `highlight` command always expects UTF-8
> encoded strings from STDIN.
>
>     $ echo 'my $v = "申";' | highlight --syntax perl | w3m -T text/html -dump
>     my $v = "申";
>
>     $ echo 'my $v = "申";' | iconv -f UTF-8 -t Shift_JIS | highlight \
>         --syntax perl | iconv -f Shift_JIS -t UTF-8 | w3m -T text/html -dump
>
>     iconv: (stdin):9:135: cannot convert
>     my $v = "
>
> This patch prepares git blob objects to be encoded into UTF-8 before
> highlighting in the manner of the `to_utf8` subroutine.
> ---

The Perl one-liner invoked from the script felt a bit too dense for
my taste, but other than that I have no complaints about what the
patched code does.

Jakub, does it look good to you, too?

Please sign off your patch (see Documentation/SubmittingPatches).

Thanks.

>  gitweb/gitweb.perl | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 05d7910..2fddf75 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -3935,6 +3935,9 @@ sub run_highlighter {
>
>      close $fd;
>      open $fd, quote_command(git_cmd(), "cat-file", "blob", $hash)." | ".
> +              quote_command($^X, '-CO', '-MEncode=decode,FB_DEFAULT', '-pse',
> +                '$_ = decode($fe, $_, FB_DEFAULT) if !utf8::decode($_);',
> +                '--', "-fe=$fallback_encoding")." | ".
>                quote_command($highlight_bin).
>                " --replace-tabs=8 --fragment --syntax $syntax |"
>          or die_error(500, "Couldn't open file or run syntax highlighter");
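
For anyone who finds the one-liner hard to read, here is a minimal sketch of what it appears to do for each input line, unrolled into a standalone script. The sub name decode_line and the sample byte strings are illustrative assumptions and not part of the patch; 'shiftjis' is used here as Encode's canonical name for Shift_JIS.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode FB_DEFAULT);

# Hypothetical helper (not in gitweb): turn one line of raw octets into
# Perl characters, trying UTF-8 first and then the fallback encoding.
sub decode_line {
    my ($octets, $fallback_encoding) = @_;

    # utf8::decode() converts in place and returns false when the octets
    # are not well-formed UTF-8, so work on a copy.
    my $text = $octets;
    return $text if utf8::decode($text);

    # Not UTF-8: interpret the original octets in the fallback encoding.
    return decode($fallback_encoding, $octets, FB_DEFAULT);
}

# Same effect as the -CO switch: encode characters as UTF-8 on output.
binmode(STDOUT, ':encoding(UTF-8)');

# "申" is 0x90 0x5c in Shift_JIS (note the trailing backslash byte)
# and 0xe7 0x94 0xb3 in UTF-8; both should decode to U+7533.
for my $octets ("\x90\x5c", "\xe7\x94\xb3") {
    my $chars = decode_line($octets, 'shiftjis');
    printf "U+%04X\n", ord($chars);
}
```

In the patch the same logic runs under -p, so every line read from `git cat-file blob` goes through this conversion before reaching `highlight`, and -s is what turns `-fe=...` into the $fe variable.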