Yasushi SHOJI <yashi@xxxxxxxxxxxxxxxxx> writes:

> commitdiff with raw, or plain format if you are reading the code,
> doesn't convert any word from perl internal to utf8, which is set to
> charset in http header.  this cause a problem when commit includes non
> ascii code.

Nice catch. Thanks.

> here is a few example in the git tree:
>
> http://git.kernel.org/?p=git/git.git;a=commitdiff_plain;h=6ba78238a824282816944550edc4297dd2808a72
> http://git.kernel.org/?p=git/git.git;a=commitdiff_plain;h=e360bebf713b6b03768c62de8b94ddf9350b0953
> http://git.kernel.org/?p=git/git.git;a=commitdiff_plain;h=9459aa77a032621a29d53605542844641cca843a

...but the commit message could be improved :-) For example:

-- >8 --
gitweb: Convert generated contents to utf8 in commitdiff_plain

If the commit message or the commit author contains non-ASCII
characters, it must be converted from Perl internal representation
to utf-8, to follow what got declared in the HTTP header.  Use
to_utf8() to do the conversion.

This necessarily replaces here-doc with "print" statements.

Signed-off-by: Yasushi SHOJI <yashi@xxxxxxxxxxxxxxxxx>
Acked-by: İsmail Dönmez <ismail@xxxxxxxxxxxxx>
Acked-by: Jakub Narebski <jnareb@xxxxxxxxx>
-- >8 --

> This patch effectively revert the commitdiff plain part of the commit
>
>   59b9f61a3f76762dc975e99cc05335a3b97ad1f9
>
> which converted from print to here-doc.  but it doesn't
> explain why in the commit log.

Sorry about that. IMVHO using a here-doc for a longer sequence of
output lines is more readable than a long "print" command, or a
sequence of prints. But of course if you have to parse / transform
some parts of the output, it is simply not possible.

> ---
>  gitweb/gitweb.perl |   12 ++++++------
>  1 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 6256641..5d9ac1d 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -5048,16 +5048,16 @@ sub git_commitdiff {
>  			-expires => $expires,
>  			-content_disposition => 'inline; filename="' .
"$filename" . '"'); > my %ad = parse_date($co{'author_epoch'}, $co{'author_tz'}); > - print <<TEXT; > -From: $co{'author'} > -Date: $ad{'rfc2822'} ($ad{'tz_local'}) > -Subject: $co{'title'} > -TEXT > + print "From: " . to_utf8($co{'author'}) . "\n"; > + print "Date: " . to_utf8($ad{'rfc2822'}) . " " > + . to_utf8($ad{'tz_local'}) . "\n"; I think that date, or at least timezone would never have characters outside US-ASCII, so to_uft8 is not really necessary, but I guess that it is better to be safe than sorry. > + print "Subject: " . to_utf8($co{'title'}) . "\n"; > + > print "X-Git-Tag: $tagname\n" if $tagname; > print "X-Git-Url: " . $cgi->self_url() . "\n\n"; > > foreach my $line (@{$co{'comment'}}) { > - print "$line\n"; > + print to_utf8($line) . "\n"; > } > print "---\n\n"; > } By the way, I guess that with new git we could just use --pretty=email option to git-log / git-rev-list, and add X-Git-Tag and X-Git-Url at the beginning (or insert it after headers). Perhaps also generate diff with the same diff command... but I think this improvement is to be done _after_ release. For what is worth: Acked-by: Jakub Narebski <jnareb@xxxxxxxxx> -- Jakub Narebski Poland ShadeHawk on #git - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html