Re: [PATCH/RFC 1/n] gitweb: Better git-unquoting and gitweb-quoting of pathnames

Junio C Hamano wrote:
> Jakub Narebski <jnareb@xxxxxxxxx> writes:
> 
> > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> > index ec46b80..a15e916 100755
> > --- a/gitweb/gitweb.perl
> > +++ b/gitweb/gitweb.perl
> > @@ -563,12 +563,42 @@ sub esc_html {
> >  	return $str;
> >  }
> >  
> > +# quote unsafe characters and escape filename to HTML
> > +sub esc_path {
> > +	my $str = shift;
> > +	$str = esc_html($str);
> > +	$str =~ s/[[:cntrl:]\a\b\e\f\n\r\t\011]/?/g; # like --hide-control-chars in ls
> > +	return $str;
> > +}
> > +
> 
> When you say "[:cntrl:]" do you need to say anything more?

Ooops. Yes, the \a\b\e\f\n\r\t\011 part is redundant.
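With the redundant part dropped, esc_path could look roughly like the
sketch below. To keep it self-contained, esc_html_stub is a hypothetical,
simplified stand-in for gitweb's esc_html (the real one does more); only
the character class is the point here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for gitweb's esc_html; assumption for this sketch only.
sub esc_html_stub {
	my $str = shift;
	$str =~ s/&/&amp;/g;
	$str =~ s/</&lt;/g;
	$str =~ s/>/&gt;/g;
	return $str;
}

# quote unsafe characters and escape filename to HTML
sub esc_path {
	my $str = shift;
	$str = esc_html_stub($str);
	# [:cntrl:] already covers \a \b \e \f \n \r \t and VT
	$str =~ s/[[:cntrl:]]/?/g; # like --hide-control-chars in ls
	return $str;
}

print esc_path("a\tb\nc\x0bd<e>"), "\n";
```

Tab, newline and vertical tab are each replaced by '?' without being
listed individually.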

> >  # git may return quoted and escaped filenames
> >  sub unquote {
> >  	my $str = shift;
> > +
> > +	sub unq {
> > +		my $seq = shift;
> > +		my %es = (
> > +			't' => "\t", # tab            (HT, TAB)
> > +			'n' => "\n", # newline        (NL)
> > +			'r' => "\r", # return         (CR)
> > +			'f' => "\f", # form feed      (FF)
> > +			'b' => "\b", # backspace      (BS)
> > +			'a' => "\a", # alarm (bell)   (BEL)
> > +			#'e' => "\e", # escape        (ESC)
> > +			'v' => "\011", # vertical tab (VT)
> > +		);
> > +
> > +		# octal char sequence
> > +		return chr(oct($seq))  if ($seq =~ m/^[0-7]{1,3}$/);
> > +		# C escape sequence (this includes '\n' (LF) and '\t' (TAB))
> > +		return $es{$seq}       if ($seq =~ m/^[abefnrtv]$/);
> 
> Problems in this part of the code X-<.
> 
>  * Was there a reason not to unwrap '\e' to "\e"?

It was not mentioned in the description of git pathname quoting in the
message
  http://marc.theaimsgroup.com/?l=git&m=112927316408690&w=2

>  * The vertical tab is \013 (decimal 11), not \011 (which is TAB).

Oops. VT is ASCII 11 in decimal, which is \013 in octal; \011 is TAB.
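A quick way to check the decimal and octal codes, as a small standalone
sketch (nothing here is taken from gitweb itself):

```perl
use strict;
use warnings;

# Print the decimal and octal code of VT and TAB side by side.
printf "VT  = %d decimal = \\%03o octal\n", ord("\x0b"), ord("\x0b");
printf "TAB = %d decimal = \\%03o octal\n", ord("\t"),   ord("\t");

# In a double-quoted Perl string, \NNN is an octal escape.
print "\\013 is VT\n"  if "\013" eq chr(11);
print "\\011 is TAB\n" if "\011" eq "\t";
```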

>  * The name and the abbreviated name of the character "\n" are
>    "line feed" and "LF"; I personally do not think these
>    character name comments are needed in this part of the code,
>    but I do not object if you want to have them there, as long
>    as you spell them correctly. cf. ISO/IEC 6429:1992 or
>    http://www.unicode.org/charts/PDF/U0000.pdf for example.

Perhaps it should be "LF ('\n') and TAB ('\t')".

>  * The hash %es and the pattern /[abef...]/ must be kept in
>    sync; it is a maintenance nightmare,
> 
>  * Worse yet, they do not agree even in this initial version,
>    which proves the previous point.
> 
> Perhaps this is better written as:
> 
> 	if (exists $es{$seq}) {
>         	return $es{$seq};
> 	}

Fact.

Or
	return $es{$seq} if exists $es{$seq};
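Put together, unq could look something like the sketch below, with the
hash as the only list of escapes, so there is no pattern to keep in sync.
Including 'e' and returning any unknown sequence unchanged are
assumptions of this sketch, and so is the substitution in the usage
lines; none of that is settled in this thread.

```perl
use strict;
use warnings;

# C escape sequences; the hash is the single source of truth.
my %es = (
	't' => "\t",   # tab            (HT, TAB)
	'n' => "\n",   # line feed      (LF)
	'r' => "\r",   # return         (CR)
	'f' => "\f",   # form feed      (FF)
	'b' => "\b",   # backspace      (BS)
	'a' => "\a",   # alarm (bell)   (BEL)
	'e' => "\e",   # escape         (ESC); assumption: included here
	'v' => "\013", # vertical tab   (VT)
);

sub unq {
	my $seq = shift;

	# octal char sequence
	return chr(oct($seq)) if ($seq =~ m/^[0-7]{1,3}$/);
	# C escape sequence
	return $es{$seq} if exists $es{$seq};
	# assumption: anything else (e.g. a quoted '"' or '\') stands for itself
	return $seq;
}

# Usage, with an assumed substitution over backslash sequences:
my $quoted = 'dir\\tname\\013x\\101';   # holds: dir\tname\013x\101
(my $plain = $quoted) =~ s/\\([0-7]{1,3}|.)/unq($1)/eg;
```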

-- 
Jakub Narebski
Poland