Re: [PATCH/RFC 1/n] gitweb: Better git-unquoting and gitweb-quoting of pathnames

Jakub Narebski <jnareb@xxxxxxxxx> writes:

> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index ec46b80..a15e916 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -563,12 +563,42 @@ sub esc_html {
>  	return $str;
>  }
>  
> +# quote unsafe characters and escape filename to HTML
> +sub esc_path {
> +	my $str = shift;
> +	$str = esc_html($str);
> +	$str =~ s/[[:cntrl:]\a\b\e\f\n\r\t\011]/?/g; # like --hide-control-chars in ls
> +	return $str;
> +}
> +

Once you say "[:cntrl:]", do you need to list anything more?
That class already matches every character spelled out after it.

I was initially puzzled by the "\t\011" bit (\011 is TAB again),
but I realize that it is a bug consistent with the one in the
next part I'll comment on.
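
A minimal sketch to check that claim: it tests each character the
patch lists explicitly against [[:cntrl:]] alone.  The helper name
hide_control_chars is hypothetical, and the esc_html step of the
patch is left out on purpose.

```perl
use strict;
use warnings;

# Every character the patch lists explicitly after [:cntrl:].
my @listed = ("\a", "\b", "\e", "\f", "\n", "\r", "\t", "\011");

# Collect any listed character that [[:cntrl:]] alone fails to match.
my @missed = grep { $_ !~ /\A[[:cntrl:]]\z/ } @listed;

# Hypothetical simplified helper: only the control-character
# substitution, without the esc_html call made by the patch.
sub hide_control_chars {
	my $str = shift;
	$str =~ s/[[:cntrl:]]/?/g;
	return $str;
}

printf "missed by [:cntrl:]: %d\n", scalar @missed;
print hide_control_chars("a\tb\013c\n"), "\n";
```

This prints "missed by [:cntrl:]: 0" followed by "a?b?c?", so the
shorter character class hides TAB, VT and LF just the same.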

>  # git may return quoted and escaped filenames
>  sub unquote {
>  	my $str = shift;
> +
> +	sub unq {
> +		my $seq = shift;
> +		my %es = (
> +			't' => "\t", # tab            (HT, TAB)
> +			'n' => "\n", # newline        (NL)
> +			'r' => "\r", # return         (CR)
> +			'f' => "\f", # form feed      (FF)
> +			'b' => "\b", # backspace      (BS)
> +			'a' => "\a", # alarm (bell)   (BEL)
> +			#'e' => "\e", # escape        (ESC)
> +			'v' => "\011", # vertical tab (VT)
> +		);
> +
> +		# octal char sequence
> +		return chr(oct($seq))  if ($seq =~ m/^[0-7]{1,3}$/);
> +		# C escape sequence (this includes '\n' (LF) and '\t' (TAB))
> +		return $es{$seq}       if ($seq =~ m/^[abefnrtv]$/);

Problems in this part of the code X-<.

 * Was there a reason not to unwrap '\e' to "\e"?

 * The vertical tab is \013 (decimal 11), not \011 (which is TAB).

 * The name and the abbreviated name of the character "\n" are
   "line feed" and "LF"; I personally do not think these
   character name comments are needed in this part of the code,
   but I do not object if you want to have them there, as long
   as you spell them correctly. cf. ISO/IEC 6429:1992 or
   http://www.unicode.org/charts/PDF/U0000.pdf for example.

 * The hash %es and the pattern /[abef...]/ must be kept in
   sync; it is a maintenance nightmare.

 * Worse yet, they do not agree even in this initial version,
   which proves the previous point.
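
The last two points, and the octal mix-up, can be checked with a
small sketch.  The hash below is copied from the quoted patch
(with 'e' left out, as it is commented out there); the variable
names @pattern_chars and @no_entry are made up for illustration.

```perl
use strict;
use warnings;

# The table as it appears in the quoted patch.
my %es = (
	't' => "\t",
	'n' => "\n",
	'r' => "\r",
	'f' => "\f",
	'b' => "\b",
	'a' => "\a",
	'v' => "\011",
);

# The characters accepted by the patch's pattern /^[abefnrtv]$/.
my @pattern_chars = split //, 'abefnrtv';

# Characters the pattern accepts but the hash cannot translate.
my @no_entry = grep { !exists $es{$_} } @pattern_chars;

print "in pattern but not in hash: @no_entry\n";
printf "\\011 is decimal %d, \\013 is decimal %d\n", oct('011'), oct('013');
```

This prints "in pattern but not in hash: e", then reports 9 for
\011 and 11 for \013; with the patch as posted, an "e" sequence
would pass the pattern and return undef.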

Perhaps this is better written as:

	if (exists $es{$seq}) {
		return $es{$seq};
	}
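
Put together, a cleaned-up helper might look roughly like the
sketch below.  It is only an illustration under assumptions, not
the final patch: it adds an 'e' entry (following the question
above; whether git ever emits that sequence is not established
here), it moves the hash outside the sub, and it returns unknown
sequences unchanged, which the quoted hunk does not show.

```perl
use strict;
use warnings;

# Single source of truth: the accepted escapes are the hash keys.
my %es = (
	't' => "\t",   # horizontal tab  (HT)
	'n' => "\n",   # line feed       (LF)
	'r' => "\r",   # carriage return (CR)
	'f' => "\f",   # form feed       (FF)
	'b' => "\b",   # backspace       (BS)
	'a' => "\a",   # bell            (BEL)
	'e' => "\e",   # escape          (ESC), assumption: wanted at all
	'v' => "\013", # vertical tab    (VT), octal 013 = decimal 11
);

sub unq {
	my $seq = shift;

	# octal char sequence
	return chr(oct($seq)) if $seq =~ m/^[0-7]{1,3}$/;
	# C escape sequence, looked up directly in the table
	return $es{$seq} if exists $es{$seq};
	# assumption: anything else is passed through unchanged
	return $seq;
}
```

With the lookup driven by "exists", adding or removing an escape
means touching the hash only, so there is no pattern to fall out
of sync with it.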

The rest must have been a lot of work, sifting through the
esc_html calls to identify which are paths and which are not.
Much appreciated; I expect a cleaned-up patch to be applied.
