Jakub Narebski <jnareb@xxxxxxxxx> wrote: > A few issues I have just noticed about this series. > > > First, about naming. "Highlighting interesting parts of diff" is > acceptable name, but "syntax highlighting for diff" is not: gitweb > already use syntax highlighting in diff views. Call it "diff refinement > highlighting", "highlighting changes / changed sections", "intraline > highlighting". OK, I'll make sure the naming is correct and consistent. > > > Second, I think (but I am not sure) that there is a bug in code that > finds common suffix and prefix. > > If I understand correctly the idea is to highlight changed part if > there is at least one of common non-whitespace suffix or prefix. > So syntax highlighting should look like this: > > 1. Both prefix and suffix are non empty and non whitespace only > > -foo -{bar} baz > +foo +{quux} baz > > 2. Non empty and non whitespace only prefix > > -foo -{bar} > +foo +{quux} > > 2. Non empty and non whitespace only suffix > > --{bar} baz > ++{quux} baz > > But in your code $prefix is not the length of common prefix, but > the position of end of prefix in the original line of diff. So > you start with $prefix = 1... even though the prefix is empty. > > How is that supposed to work? But see the check: + # Mark lines that are different from each other, but have some common + # part that isn't whitespace. If lines are completely different, don't + # mark them because that would make output unreadable, especially if + # diff consists of multiple lines. + if (($prefix == 1 && $suffix_rem == $#r && $suffix_add == $#a) ^ | This part. + || ($prefix_is_space && $suffix_is_space)) { + $esc_rem = esc_html($rem); + $esc_add = esc_html($add); + } else { + $esc_rem = esc_html_mark_range(\@r, $prefix, $suffix_rem); + $esc_add = esc_html_mark_range(\@a, $prefix, $suffix_add); + } I guess it's still not correct because it should be equal to number of parents or $prefix_length should start from 0 like you wrote later. > > > On Mon, 13 Feb 2012, Michał Kiedrowicz wrote: > > Jakub Narebski <jnareb@xxxxxxxxx> wrote: > > > On Mon, 13 Feb 2012, Michal Kiedrowicz wrote: > > > > > I haven't found *examples* on GitHub and Trac sites, but what about > > > > these ones: > > > > > > > > https://github.com/gitster/git/commit/8cad4744ee37ebec1d9491a1381ec1771a1ba795 > > > > http://trac.edgewall.org/changeset/10973 > [...] > > > BTW GitHub is closed source, but we can check what algorithm does Trac > > > use for diff refinement highlighting (highlighting changed portions of > > > diff). > > > > > > > I think it's > > http://trac.edgewall.org/browser/trunk/trac/versioncontrol/diff.py > > (see markup intraline_changes()). > > It is get_change_extent() that finds extent of changes, as a pair > containing the offset at which the changes start, and the negative offset > at which the changes end. So it is the same solution you use, only > without ignoring whitespace-only prefixes and suffixes... This code can > be easily ported to Perl, BTW. > > The markup_intraline_changes() function compares lines from preimage and > from postimage pairwise, requiring that number of lines matches, the same > like in your algorithm. > So using Jeff's diff-highlight we remain quite consistent with Trac output. There's nothing we can "steal" from it. > > > > I would also consider ignoring prefixes/suffixes with punctuation, like: > > > > > > > > - * I like you. > > > > + * Alice had a little lamb. > > > > > > But this patch doesn't implement this feature yet, isn't it? > > > > No, but is a matter of adding > > > > -$prefix_is_space = 0 if ($r[$prefix] !~ /\s/); > > +$prefix_is_space = 0 if ($r[$prefix] !~ /\s|[[:punct:]]/); > > > > (and the same for suffix) > > All right. But it is better added as separate patch. Sure > Perhaps even > requiring that not only there is at least one of common prefix or common > suffix, but at least one of them is not whitespace only could be put > in a separate commit... > > > > > > > +sub format_rem_add_line { > > > > > > + my ($rem, $add) = @_; > > > > > > + my @r = split(//, $rem); > > > > > > + my @a = split(//, $add); > > > > > > BTW the name of variable can be just @add and @rem. > > > > > > > I know they are different scopes but I don't like it. It makes the code > > more confusing IMO. But I won't insist. > > In my opinion if the variable refers to the same entity in different > forms, using @foo and %foo (used in gitweb), or $foo and @foo (could be > used here) is all right, and even better than trying to come up with > different name for the same thing because of sigil. OK. I'll follow that convention then. > > > > > > Shouldn't > > > > > $prefix / $prefix_len start from 0, not from 1? > > > > > > > > It starts from 1 because it skips first +/-. It should become obvious > > > > after reading the comment from last patch :). > > This means that $prefix is true even if prefix is empty ($prefix == 1). > Wouldn't it be better for $prefix_len to count length of true prefix, > without diff adornment? Or make @r / @rem skip initial characters... > > > > > + # In combined diff we must ignore two +/- characters. > > > > + $prefix = 2 if ($is_combined); > > > > > > Anyway comment about that fact would be nice. > > > > Will do. > > BTW. it is not "2" but "scalar @{$co{'parents'}}". > OK. > > > > > The splitting and comparing by characters is taken from diff-highlight. > > > > I don't think it's worth changing here. > > > > > > You are right. > > > > > > I'll try to come with hacky algorithm using string bitwise xor and regexp, > > > and benchmark it comparing to your C-like solution, but it can be left for > > > later (simple is better than clever, usually). > > > > If you have time :). > > Anyway it would be separate commit. Better to just copy tested code > from contrib/diff-highlight > > BTW. would "git blame -C -C -C -w" detect this correctly as code > movement^W copying? > Cannot say. Have you considered what I wrote in a separate e-mail, about using diff-highlight output directly / as a library? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html