Re: [PATCH] gitweb: convert from perl internal to utf8 for commitdiff_plain

Yasushi SHOJI <yashi@xxxxxxxxxxxxxxxxx> writes:

> commitdiff with raw (or "plain" format, if you are reading the code)
> doesn't convert any text from Perl internal to utf8, which is what
> the charset in the HTTP header is set to.  This causes a problem when
> a commit includes non-ASCII characters.

Nice catch. Thanks.
 
> here are a few examples in the git tree:
> 
> http://git.kernel.org/?p=git/git.git;a=commitdiff_plain;h=6ba78238a824282816944550edc4297dd2808a72
> http://git.kernel.org/?p=git/git.git;a=commitdiff_plain;h=e360bebf713b6b03768c62de8b94ddf9350b0953
> http://git.kernel.org/?p=git/git.git;a=commitdiff_plain;h=9459aa77a032621a29d53605542844641cca843a

...but the commit message could be improved :-)

For example:

-- >8 --
gitweb: Convert generated contents to utf8 in commitdiff_plain

If the commit message or the commit author contains non-ASCII
characters, it must be converted from the Perl internal representation
to UTF-8, to match the charset declared in the HTTP header.  Use
to_utf8() to do the conversion.

This necessarily replaces the here-doc with "print" statements.

Signed-off-by: Yasushi SHOJI <yashi@xxxxxxxxxxxxxxxxx>
Acked-by: İsmail Dönmez <ismail@xxxxxxxxxxxxx>
Acked-by: Jakub Narebski <jnareb@xxxxxxxxx>
-- >8 --

> This patch effectively reverts the commitdiff plain part of the commit
> 
> 	59b9f61a3f76762dc975e99cc05335a3b97ad1f9
> 
> which converted from print to here-doc, but that commit doesn't
> explain why in its commit log.

Sorry about that.

IMVHO using a here-doc for a longer sequence of output lines is more
readable than one long "print" command, or a sequence of prints. But of
course if you have to parse / transform some parts of the output, it is
simply not possible.
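
To illustrate the difference, here is a minimal sketch, not the gitweb
code itself. The helper to_utf8_sketch() and the values in %co are made
up for this example, and the helper simply encodes with
Encode::encode() as the patch description suggests; the real to_utf8()
in gitweb may well work differently.

```perl
use strict;
use warnings;
use Encode;

# Hypothetical stand-in for gitweb's to_utf8(); an assumption for this
# sketch only, the real helper may behave differently.
sub to_utf8_sketch {
    my ($str) = @_;
    return Encode::encode('UTF-8', $str);
}

# Made-up commit data containing two non-ASCII characters.
my %co = (
    author => "\x{130}smail D\x{f6}nmez",
    title  => 'Example subject',
);

# Here-doc: variables are interpolated as they are, so there is no
# place to run a per-field conversion.
my $heredoc = <<TEXT;
From: $co{'author'}
Subject: $co{'title'}
TEXT

# print-style concatenation: every field can go through the helper.
my $concat = "From: " . to_utf8_sketch($co{'author'}) . "\n"
           . "Subject: " . to_utf8_sketch($co{'title'}) . "\n";

print length($heredoc), " characters in the here-doc version\n";
print length($concat), " bytes in the concatenated version\n";
```

The two lengths differ because each of the two non-ASCII characters
takes more than one byte once encoded.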

> ---
>  gitweb/gitweb.perl |   12 ++++++------
>  1 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 6256641..5d9ac1d 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -5048,16 +5048,16 @@ sub git_commitdiff {
>  			-expires => $expires,
>  			-content_disposition => 'inline; filename="' . "$filename" . '"');
>  		my %ad = parse_date($co{'author_epoch'}, $co{'author_tz'});
> -		print <<TEXT;
> -From: $co{'author'}
> -Date: $ad{'rfc2822'} ($ad{'tz_local'})
> -Subject: $co{'title'}
> -TEXT
> +		print "From: " . to_utf8($co{'author'}) . "\n";
> +		print "Date: " . to_utf8($ad{'rfc2822'}) . " "
> +			       . to_utf8($ad{'tz_local'}) . "\n";

I think that the date, or at least the timezone, would never have
characters outside US-ASCII, so to_utf8 is not really necessary there,
but I guess that it is better to be safe than sorry.
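
As a quick check of that reasoning (a sketch only: the date string is
made up, and Encode::encode() stands in for whatever conversion is
applied), encoding a pure US-ASCII string as UTF-8 leaves it
unchanged, so the extra call costs nothing in correctness:

```perl
use strict;
use warnings;
use Encode;

# Made-up RFC 2822-style date; it contains only US-ASCII characters.
my $date = 'Mon, 1 Jan 2007 12:00:00 +0000';

# US-ASCII is a subset of UTF-8, so the encoded bytes are identical.
my $encoded = Encode::encode('UTF-8', $date);

print $encoded eq $date ? "unchanged\n" : "changed\n";
```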

> +		print "Subject: " . to_utf8($co{'title'}) . "\n";
> +
>  		print "X-Git-Tag: $tagname\n" if $tagname;
>  		print "X-Git-Url: " . $cgi->self_url() . "\n\n";
>  
>  		foreach my $line (@{$co{'comment'}}) {
> -			print "$line\n";
> +			print to_utf8($line) . "\n";
>  		}
>  		print "---\n\n";
>  	}

By the way, I guess that with a new enough git we could just use the
--pretty=email option to git-log / git-rev-list, and add X-Git-Tag and
X-Git-Url at the beginning (or insert them after the headers).  Perhaps
also generate the diff with the same command... but I think this
improvement is to be done _after_ the release.
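
A rough sketch of the "insert after headers" part, working on a string
rather than on real git output. The helper add_extra_headers(), the
sample message and the header values are all hypothetical; it only
assumes that the header block ends at the first blank line:

```perl
use strict;
use warnings;

# Hypothetical helper: append extra header lines to the end of the
# header block, i.e. just before the first blank line of the message.
sub add_extra_headers {
    my ($message, @extra) = @_;
    my ($headers, $body) = split /\n\n/, $message, 2;
    $body = '' unless defined $body;
    return join("\n", $headers, @extra) . "\n\n" . $body;
}

# Made-up message in an email-like layout: headers, blank line, body.
my $message = "From: A U Thor <author\@example.com>\n"
            . "Subject: [PATCH] example\n"
            . "\n"
            . "Commit message body.\n";

print add_extra_headers($message,
    'X-Git-Tag: v0.0.0-example',
    'X-Git-Url: http://example.com/?p=project.git;a=commitdiff_plain');
```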

For what it's worth:

Acked-by: Jakub Narebski <jnareb@xxxxxxxxx>

-- 
Jakub Narebski
Poland
ShadeHawk on #git