Re: [PATCH/RFC] gitweb: highlight: strip non-printable characters via col(1)

On Tue, 16 Aug 2011 at 1:30pm, J.H. wrote:

> On 08/16/2011 11:16 AM, Christopher M. Fuhrman wrote:
> > From: "Christopher M. Fuhrman" <cfuhrman@xxxxxxxxx>
> >
> > The current code, as is, passes control characters, such as form-feed
> > (^L) to highlight which then passes it through to the browser.  This
> > will cause the browser to display one of the following warnings:
> >

<snip>

> > Strip non-printable control-characters by piping the output produced
> > by git-cat-file(1) to col(1) as follows:
> >
> >   git cat-file blob deadbeef314159 | col -bx | highlight <args>
>
> So my only real concern here is that `col` itself is going to munge
> whitespace.  Quoting from the col man page:
>
> 	[...] and replaces white-space characters with tabs where
> 	    possible. [...]

I figured that would be a concern which is why I added the -x option.
From the col(1) man page:

  -x        Output multiple spaces instead of tabs.
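
For anyone who wants to see the intended effect without depending on a
particular col(1) implementation, here is a rough sketch that
approximates it with tr(1) and expand(1).  The helper name strip_ctrl is
made up for this example and is not part of the patch.  Unlike col -b,
it does not process backspace overstrikes, and it simply drops carriage
returns.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): approximate the visible
# effect of "col -bx" on ordinary source text.
strip_ctrl() {
	# Delete control characters except tab (\011) and newline (\012),
	# then expand tabs to spaces (8-column tab stops by default).
	tr -d '\000-\010\013-\037\177' | expand
}

# Made-up sample: a tab-indented line, a lone form-feed line, and a
# plain line.  The tab becomes spaces and the form-feed line becomes
# an empty line.
printf 'int\tx;\n\f\nint y;\n' | strip_ctrl
```

In the actual patch the filter is col -bx itself; the sketch only shows
what "no control characters, spaces instead of tabs" looks like.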

I also diffed two XHTML files, one generated with col -bx and one
generated without it.  Here are the results:

--- withoutcol.xhtml	2011-08-16 14:11:39.000000000 -0700
+++ withcol.xhtml	2011-08-16 14:11:26.000000000 -0700
@@ -52,7 +52,7 @@
 <span class="hl dir"># define DBG_CFG(args)</span>
 <span class="hl dir">#endif</span>

-
+
 <span class="hl com">/*</span>
 <span class="hl com"> * Routines to access TIG registers.</span>
 <span class="hl com"> */</span>
@@ -76,7 +76,7 @@
         <span class="hl sym">*</span>tig_addr <span class="hl sym">= (</span><span class="hl kwb">unsigned long</span><span class="hl sym">)</span>value<span class="hl sym">;</span>
 <span class="hl sym">}</span>

-
+
 <span class="hl com">/*</span>
 <span class="hl com"> * Given a bus, device, and function number, compute resulting</span>
 <span class="hl com"> * configuration space address</span>
@@ -197,7 +197,7 @@
         <span class="hl sym">.</span>write <span class="hl sym">=</span>        titan_write_config<span class="hl sym">,</span>
 <span class="hl sym">};</span>

(remainder stripped)
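
The removed and added lines in these hunks look identical, presumably
because the only difference is a form feed, which most viewers do not
display.  A minimal sketch of how to make it visible, assuming a cat(1)
that supports -v (GNU and BSD cat both do); the input is made up to
mimic one "-"/"+" pair from the diff:

```shell
#!/bin/sh
# Emit a "-" line carrying a form feed and a bare "+" line, then show
# control characters in caret notation.
printf '%s\f\n%s\n' - + | cat -v
# prints "-^L" followed by "+"
```

Piping the real diff through cat -v (or od -c) should show the same
thing for each affected hunk.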

>
> Have you actually run into a situation where something like ^L was
> present in a blob that was being passed to highlight?
>

I've seen ^L in the Linux kernel source tree as well as the NetBSD src
tree.  I haven't encountered it elsewhere, although I would expect it to
show up depending on personal/corporate coding preferences.
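
One way to check a given tree for this, sketched under the assumption
of a grep(1) that supports -r (GNU and BSD grep do).  The function name
find_formfeeds is made up for this example:

```shell
#!/bin/sh
# Hypothetical helper: list files under a directory (default: the
# current one) that contain at least one form-feed character (^L).
find_formfeeds() {
	ff=$(printf '\f')
	# -r: recurse into subdirectories, -l: print only file names
	grep -rl "$ff" "${1:-.}"
}

# Example use (the path is a placeholder):
#   find_formfeeds path/to/src
```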

> - John

Cheers!

-- 
Chris Fuhrman
cfuhrman@xxxxxxxxx

