Re: [PATCH] gitweb: protect blob and diff output lines from controls.

Junio C Hamano wrote:
> Jakub Narebski <jnareb@xxxxxxxxx> writes:
> 
>> 1. First, esc_path should _not_ use a subroutine which does its own
>> control character escaping. That was also a mistake I made in my patch.
>> Perhaps we should have some quot_html or to_html subroutine which does 
>> _only_ to_utf8 (decode from Encode module), escapeHTML and optionally 
>> s/ /&nbsp;/g conversion.
> 
> I hated that original arrangement, 

What did you hate, again?

>                                    but I do not see anything 
> obviously wrong in the output with the patch you are responding
> to.  Except that git_blame2 is missing a chomp() on "my $data"
> after finishing the metainfo loop, that is.

My original code for esc_path used esc_html, which did its own
(very partial) escaping of special characters, namely
\014 (\f) => ^L and \033 (\e) => ^[. So if a pathname had a form feed
character, it would be replaced by ^L, not '\f'.

You have added quot_cec to the esc_html subroutine directly. I don't know
what your version of esc_html looks like after the changes you made, but
this means the escaping part (quot) in esc_path is never invoked.

>> 2. In my opinion CS is better than CEC for quoting/escaping control 
>> characters in the "bulk" output, namely "blob" output and "text 
>> diff" (patchset body) output. CEC is better for pathnames (which must 
>> fit in one line), and perhaps other one-liners; perhaps not.
> 
> I am more for code reuse and consistency.  If "^L" is more
> readable then we should consistently use it for both contents
> and pathnames.  

Well, a pathname has the constraint that it must fit on a single line
after quoting, while the "blob" output is multi-page. IMHO CEC like \n, \f,
\t is better in pathnames because this is what ls uses, while CS
is better for "blob" output because editors (including the one true
editor, GNU Emacs ;-) use CS like ^L. (In "blob" output there is no
end-of-line character, as we split on LF and chomp, and no tab character,
because the line is untabified first.) But that is my opinion.
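
As an illustration only, the two quoting styles discussed above can be
sketched as follows. This is Python rather than gitweb's Perl, chosen for
brevity; quot_cec is the name used in this thread, while quot_cs and both
function bodies are assumptions made for the example, not the code from
the patch.

```python
import re

# Controls that have a short C-style escape code (CEC).
CEC = {"\t": "\\t", "\n": "\\n", "\r": "\\r", "\f": "\\f",
       "\b": "\\b", "\a": "\\a", "\v": "\\v", "\x1b": "\\e", "\0": "\\0"}

# ASCII control characters, including DEL.
CONTROL = re.compile(r"[\x00-\x1f\x7f]")

def quot_cec(text):
    """Quote controls as escape codes: form feed becomes backslash + f."""
    def repl(m):
        ch = m.group(0)
        # Fall back to a three-digit octal escape when there is no short code.
        return CEC.get(ch, "\\%03o" % ord(ch))
    return CONTROL.sub(repl, text)

def quot_cs(text):
    """Quote controls in caret notation: form feed becomes ^L."""
    # Flipping bit 0x40 maps \x0c to 'L', \x1b to '[' and \x7f to '?'.
    return CONTROL.sub(lambda m: "^" + chr(ord(m.group(0)) ^ 0x40), text)
```

With these definitions, quot_cec("a\fb") yields a\fb (a literal backslash
followed by f), while quot_cs("a\fb") yields a^Lb.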

I think that control characters in filenames (in esc_path) should
be wrapped in <span class="cntrl">...</span> and styled.
I'm not sure whether they should be styled in "blob" view. For sure
there should be no <span>...</span> in escaped attributes (future
esc_attr). A common to_html/quot_html would give us code reuse (as
quot_cec does), if not consistency.
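
A minimal sketch of that split, again in Python rather than gitweb's Perl
and purely as an illustration: to_html does only the HTML escaping,
esc_path adds the styled <span>, and esc_attr quotes without any markup.
The helper cec and all function bodies are hypothetical; the input is
assumed to be already decoded text, so the to_utf8 step is left out.

```python
import html
import re

# ASCII control characters, including DEL.
CONTROL = re.compile(r"[\x00-\x1f\x7f]")

def to_html(text, nbsp=False):
    """HTML-escape only; no control-character quoting happens here."""
    out = html.escape(text, quote=True)
    return out.replace(" ", "&nbsp;") if nbsp else out

def cec(ch):
    """Hypothetical per-character quoting: short escape code or octal."""
    names = {"\t": "\\t", "\n": "\\n", "\f": "\\f", "\r": "\\r"}
    return names.get(ch, "\\%03o" % ord(ch))

def esc_path(path):
    """Escape a pathname for element content, marking quoted controls."""
    parts = []
    pos = 0
    for m in CONTROL.finditer(path):
        # Plain text before the control character is only HTML-escaped.
        parts.append(to_html(path[pos:m.start()]))
        # The quoted control character is wrapped so that CSS can style it.
        parts.append('<span class="cntrl">%s</span>' % to_html(cec(m.group(0))))
        pos = m.end()
    parts.append(to_html(path[pos:]))
    return "".join(parts)

def esc_attr(value):
    """Escape an attribute value: same quoting, but never any <span>."""
    return to_html(CONTROL.sub(lambda m: cec(m.group(0)), value))
```

Because the quoting lives outside to_html, neither caller escapes control
characters twice, which is the problem described for esc_path above.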

>                One of my tests were a symlink that points at a 
> funny filename ;-).

IMHO this should rather be solved by better "tree" view support
for symlinks: 'symlink' -> 'target', like in ls -l output.

>> BTW. what had happened with to_qtext post?
> 
> Sorry, I don't recall.

There was quite a bit of discussion about the _suggested_
filename in the blob_plain and blobdiff_plain views, namely the
  -content_disposition => 'inline; filename="' ...
HTTP header. The result (probably lost in the noise) was to
add a to_qtext subroutine for that.
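
The to_qtext code itself is not shown in this message, so the following is
only a guess at the idea, sketched in Python rather than gitweb's Perl:
backslash-escape the characters that would break an HTTP quoted-string.
The handling of control characters (replacing them with '?') and the
content_disposition helper are assumptions made for the example.

```python
import re

def to_qtext(name):
    """Quote a filename for use inside an HTTP quoted-string."""
    # Backslashes first, so the ones added for quotes are not doubled again.
    name = name.replace("\\", "\\\\")
    name = name.replace('"', '\\"')
    # Assumption: any control character is replaced by '?'.
    return re.sub(r"[\x00-\x1f\x7f]", "?", name)

def content_disposition(name):
    """Build the header value with the suggested filename."""
    return 'inline; filename="%s"' % to_qtext(name)
```

For example, content_disposition('x\ny.txt') gives
inline; filename="x?y.txt", so an embedded newline cannot split the header.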

Time to go to sleep...
-- 
Jakub Narebski
Poland