gitweb - encoding problems

I use ISO-8859-1 as my locale, so my blobs, commits and tags are in
this encoding.

On perl v5.8.6, decode_utf8 of any non-UTF-8 value returns undef:

$ cat xy
#!/usr/bin/perl
use strict;
use warnings;
use CGI qw(:standard :escapeHTML -nosticky);
use CGI::Util qw(unescape);
use CGI::Carp qw(fatalsToBrowser);
use Encode;
use Fcntl ':mode';
use File::Find qw();
use File::Basename qw(basename);

binmode STDOUT, ':utf8';

print decode_utf8('äöü');
$ perl xy
[Mon May 21 22:00:00 2007] xy: Use of uninitialized value in print at xy line 14.

If gitweb encounters, e.g., an umlaut (äöü) in a commit or tag, "Use of
uninitialized value" messages are generated. In one case,
decode_utf8($long) in format_subject_html is undefined, which results
in an invalid link (the a tag contains only a title attribute without
any value assigned) and a browser message that the HTML is not valid.

The previously installed version of git/gitweb (1.5.0rc3) showed only
small black rhombuses, but didn't generate "uninitialized value"
messages or invalid HTML.

So there is a regression between git-1.5.0 and git-1.5.2.

Adding $var = encode_utf8($var) if (!defined decode_utf8($var)) at the
site of each "uninitialized value" message gives correct output for me.
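
The same idea can be written as one helper instead of a check at every
call site. This is only a rough sketch, not gitweb code: the name to_text
and the choice of ISO-8859-1 as the fallback are assumptions made for
illustration. It asks Encode for a strict UTF-8 decode with an explicit
check argument, and falls back to Latin-1 when the bytes are not valid
UTF-8.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper (not part of gitweb): turn raw bytes into a Perl
# character string. Try strict UTF-8 first; if that fails, assume the
# bytes are ISO-8859-1.
sub to_text {
    my ($bytes) = @_;
    return unless defined $bytes;

    # FB_CROAK makes decode() die on malformed input instead of
    # returning undef or substituting U+FFFD; LEAVE_SRC keeps $bytes
    # unmodified so it can be reused for the fallback.
    my $text = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return $text if defined $text;

    # Every byte sequence is valid ISO-8859-1, so this cannot fail.
    return decode('ISO-8859-1', $bytes);
}

binmode STDOUT, ':utf8';
print to_text("\xe4\xf6\xfc"), "\n";    # Latin-1 bytes for "äöü"
```

This is a heuristic: a Latin-1 byte sequence that happens to be valid
UTF-8 would be decoded as UTF-8, so it only helps when such collisions
are unlikely in practice.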

I wanted to post a patch with these changes, as it solved my locale problem.
But then I tried the same on a different computer with a newer perl (v5.8.8):
$ cat x
#!/usr/bin/perl
use strict;
use warnings;
use CGI qw(:standard :escapeHTML -nosticky);
use CGI::Util qw(unescape);
use CGI::Carp qw(fatalsToBrowser);
use Encode;
use Fcntl ':mode';
use File::Find qw();
use File::Basename qw(basename);

binmode STDOUT, ':utf8';

print decode_utf8('äöü');
$ perl x
ᅵᅵï¿

Here perl decodes the ISO-8859-1 text to something differnent:
00000000  ef bf bd ef bf bd ef bf  bd                       |ᅵᅵï¿

The result is that all umlauts are shown as a small black rhombus
in gitweb (and no invalid HTML is produced).
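
For reference, the nine bytes in the dump above are three copies of the
UTF-8 encoding of U+FFFD (REPLACEMENT CHARACTER), one for each byte that
could not be decoded. A small check, with variable names chosen only for
this illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Encode three U+FFFD characters as UTF-8 and format the bytes as hex.
my $bytes = encode('UTF-8', "\x{FFFD}" x 3);
my $hex   = join ' ', map { sprintf '%02x', ord } split //, $bytes;

print "$hex\n";    # prints: ef bf bd ef bf bd ef bf bd
```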

Regards, Martin Kögler

$ cat x | hexdump -C
00000000  23 21 2f 75 73 72 2f 62  69 6e 2f 70 65 72 6c 0a  |#!/usr/bin/perl.|
00000010  75 73 65 20 73 74 72 69  63 74 3b 0a 75 73 65 20  |use strict;.use |
00000020  77 61 72 6e 69 6e 67 73  3b 0a 75 73 65 20 43 47  |warnings;.use CG|
00000030  49 20 71 77 28 3a 73 74  61 6e 64 61 72 64 20 3a  |I qw(:standard :|
00000040  65 73 63 61 70 65 48 54  4d 4c 20 2d 6e 6f 73 74  |escapeHTML -nost|
00000050  69 63 6b 79 29 3b 0a 75  73 65 20 43 47 49 3a 3a  |icky);.use CGI::|
00000060  55 74 69 6c 20 71 77 28  75 6e 65 73 63 61 70 65  |Util qw(unescape|
00000070  29 3b 0a 75 73 65 20 43  47 49 3a 3a 43 61 72 70  |);.use CGI::Carp|
00000080  20 71 77 28 66 61 74 61  6c 73 54 6f 42 72 6f 77  | qw(fatalsToBrow|
00000090  73 65 72 29 3b 0a 75 73  65 20 45 6e 63 6f 64 65  |ser);.use Encode|
000000a0  3b 0a 75 73 65 20 46 63  6e 74 6c 20 27 3a 6d 6f  |;.use Fcntl ':mo|
000000b0  64 65 27 3b 0a 75 73 65  20 46 69 6c 65 3a 3a 46  |de';.use File::F|
000000c0  69 6e 64 20 71 77 28 29  3b 0a 75 73 65 20 46 69  |ind qw();.use Fi|
000000d0  6c 65 3a 3a 42 61 73 65  6e 61 6d 65 20 71 77 28  |le::Basename qw(|
000000e0  62 61 73 65 6e 61 6d 65  29 3b 0a 0a 62 69 6e 6d  |basename);..binm|
000000f0  6f 64 65 20 53 54 44 4f  55 54 2c 20 27 3a 75 74  |ode STDOUT, ':ut|
00000100  66 38 27 3b 0a 0a 70 72  69 6e 74 20 64 65 63 6f  |f8';..print deco|
00000110  64 65 5f 75 74 66 38 28  27 e4 f6 fc 27 29 3b 0a  |de_utf8('äöü');.|
00000120