Jakub Narębski <jnareb@xxxxxxxxx> writes:

> On 09.04.2013 19:40, Jürgen Kreileder wrote:
>> Jakub Narębski <jnareb@xxxxxxxxx> writes:
>>> Jürgen Kreileder wrote:
>>>
>>>> Properly encode site and project names for RSS and Atom feeds.
>
>>>> -       my $title = "$site_name - $project/$action";
>>>> +       my $title = to_utf8($site_name) . " - " . to_utf8($project) . "/$action";
>
>>> Was this patch triggered by some bug?
>>
>> Yes, I actually see broken encoding with the old code, e.g. on
>> https://git.blackdown.de/old.cgi?p=contactalbum.git;a=rss
>> my first name is messed up in the title tag.
>>
>> New version: https://git.blackdown.de/?p=contactalbum.git;a=rss
>>
>>> Because the above is not necessary, as git_feed() has
>>>
>>>     $title = esc_html($title);
>>>
>>> a bit later, which does to_utf8() internally.
>>
>> Good point.  But it doesn't fix the string in question:
>> It looks like to_utf8("$a $b") != (to_utf8($a) . " " . to_utf8($b)).
>
> Strange.  I wonder if the bug is in our to_utf8() implementation,
> or in Encode, or in Perl... and whether this bug can be triggered
> anywhere else in gitweb.

I don't think it's a bug, more like a consequence of concatenating
utf8 and non-utf8 strings:

    my $a = "ü";
    my $b = "ü";
    my $c = "$a - $b";
    print "$c -> ". to_utf8($c) . ": " . (utf8::is_utf8($c) ? "utf8" : "not utf8") . "\n"; # GOOD
    $b = to_utf8($b);
    $c = "$a - $b";
    print "$c -> ". to_utf8($c) . ": " . (utf8::is_utf8($c) ? "utf8" : "not utf8") . "\n"; # BAD

yields (hopefully the broken encoding shows up correctly here):

    ü - ü -> ü - ü: not utf8
    ü - ü -> ü - ü: utf8

In gitweb we have the bad case:

    my $title = "$site_name - $project/$action";

$project and $action are apparently utf8 already but $site_name isn't.
The resulting string is marked as utf8, although the encoding of
$site_name was never fixed.  The to_utf8() in esc_html() returns the
string without fixing anything because of that.

> What Perl version and Encode module version do you use?

5.14.2 and 2.42_01 on Ubuntu.  Same results with 5.12.4 and 2.39 on OS X.

        Juergen
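
For reference, the effect described above can be reproduced outside
gitweb with a small self-contained script.  This is only a sketch:
to_utf8_sketch() is a hypothetical, simplified stand-in for gitweb's
to_utf8() (it assumes UTF-8 input and has no fallback encoding), and the
variable contents are made up for illustration.  It prints code points
instead of the characters themselves, so the result does not depend on
the terminal encoding.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode ();

# Hypothetical, simplified stand-in for gitweb's to_utf8(): strings that
# are already flagged as character data are returned untouched, byte
# strings are decoded as UTF-8.  Not the actual gitweb implementation.
sub to_utf8_sketch {
    my ($str) = @_;
    return $str if !defined $str || utf8::is_utf8($str);
    return Encode::decode('UTF-8', $str);
}

# "\xc3\xbc" is the UTF-8 byte sequence for U+00FC (u with diaeresis).
my $site_name = "\xc3\xbc";                           # raw bytes, not decoded
my $project   = Encode::decode('UTF-8', "\xc3\xbc");  # already decoded

# Interpolate first, decode afterwards: Perl upgrades the raw bytes as if
# they were Latin-1, the result carries the utf8 flag, and the decoding
# step becomes a no-op.
my $bad = to_utf8_sketch("$site_name - $project");

# Decode each piece first, then concatenate (the approach of the patch).
my $good = to_utf8_sketch($site_name) . " - " . to_utf8_sketch($project);

# %vX prints the code point of every character in hex, joined by dots.
printf "bad:  %vX\n", $bad;    # bad:  C3.BC.20.2D.20.FC
printf "good: %vX\n", $good;   # good: FC.20.2D.20.FC
```

In the "bad" line the site name shows up as the two characters U+00C3
U+00BC instead of the single U+00FC, which is the mangled title seen in
the feed.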