Re: Difficulty with parsing colorized diff output


 



Peff & Stefan, thank you for the feedback. For my purposes, I am content to rely on gitconfig to reduce the colors to something that I can parse without losing information. Since my first email I have found that setting `wsErrorHighlight = none` in the `[diff]` section gets rid of the problematic extra green highlights in the `+` lines.
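For a tool that cannot count on the user's gitconfig, the same setting can be passed per invocation. The following is only a rough sketch: the helper name `diff_command` is made up for illustration, and it assumes the tool wants color forced on with `--color=always`.

```python
import subprocess

def diff_command(*diff_args):
    """Build a `git diff` argv with whitespace-error highlighting disabled.

    Hypothetical helper: `-c diff.wsErrorHighlight=none` is the one-off
    equivalent of `wsErrorHighlight = none` under `[diff]` in gitconfig.
    """
    return [
        "git",
        "-c", "diff.wsErrorHighlight=none",
        "diff",
        "--color=always",  # assumption: the caller wants the colored output
        *diff_args,
    ]

def run_diff(*diff_args):
    """Run the command and return its stdout as text (needs a git repository)."""
    result = subprocess.run(
        diff_command(*diff_args), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Passing the value with `-c` leaves the user's own config alone and applies only to that one command.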

I still think that a config to differentiate log coloring from diff coloring would be worthwhile, as it would guarantee that highlighters are getting unadulterated lines from git.

I also think that bracketing every colored line with a color code before the space/plus/minus and a reset just before the newline would be smart, because then a parser could work strictly line by line: if there is an SGR sequence before the space/plus/minus, it knows to strip off the final reset. As it stands now, a context line (with no leading color) is ambiguous by itself; I have to remember from previous lines in the hunk that we have been seeing colors in order to know whether I should strip off the reset.
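To make the suggestion concrete, here is a minimal sketch of the per-line parser that such bracketing would allow. It describes the proposed output, not what git emits today. The function name `split_bracketed_line` is hypothetical, and I am assuming the reset is `ESC[m` (or `ESC[0m`).

```python
import re

# Any SGR sequence: ESC [ <params> m
SGR = re.compile(r"\x1b\[[0-9;]*m")
# Assumed reset forms at end of line: ESC[m or ESC[0m
TRAILING_RESET = re.compile(r"\x1b\[0?m$")

def split_bracketed_line(line):
    """Return (marker, content) for one hunk line under the proposed scheme.

    A leading SGR sequence signals that git added the final reset, so the
    reset is stripped only in that case. No state is carried between lines.
    """
    line = line.rstrip("\n")
    lead = SGR.match(line)
    if lead:
        line = line[lead.end():]
        line = TRAILING_RESET.sub("", line, count=1)
    return line[:1], line[1:]
```

With this, a line such as `ESC[32m+added ESC[m` yields `+` and `added `, while an uncolored context line whose own content happens to end in a reset keeps that reset, because nothing before the marker says git put it there.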

I agree that a machine-readable format would be nice, but regardless it would be useful to make the regular output more parser-friendly.


> On 2018-12-11, at 5:17 AM, Jeff King <peff@xxxxxxxx> wrote:
> 
> On Mon, Dec 10, 2018 at 07:26:46PM -0800, Stefan Beller wrote:
> 
>>> Context lines do have both. It's just that the default color for context
>>> lines is empty. ;)
>> 
>> The content itself can contain color codes.
>> 
>> Instead of unconditionally resetting each line, we could parse each
>> content line to determine if we actually have to reset the colors.
> 
> Good point. I don't recall that being the motivation back when this
> behavior started, but it's a nice side effect (and the more recent line
> you mentioned in emit_line_0 certainly is doing it intentionally).
> 
> That doesn't cover _other_ terminal codes, which could also make for
> confusing output, but I do think color codes are somewhat special. We
> generally send patches through "less -R", which will pass through the
> colors but show escaped versions of other codes.
> 
>> Another idea would be to allow Git to output its output
>> as if it was run through test_decode_color, slightly related:
>> https://public-inbox.org/git/20180804015317.182683-8-sbeller@xxxxxxxxxx/
>> i.e. we'd markup the output instead of coloring it.
> 
> Yeah, I think in the most general form, the problem is that colorizing
> (including whitespace highlighting) loses information within a single
> line. It would be nice to have a machine-readable format that represents
> all the various annotations (like whitespace and coloring moved bits)
> that Git computes.
> 
> -Peff




